Low Linear-Prob accuracy #65
Comments
I got similar results. MAE+ViT-B+400ep: the linear-probing top-1 accuracy is 53.01, using the AdamW optimizer.
Thanks for sharing.
Sorry, no idea.
Yes, I used your linear-probe method and got about +0.33% when testing the MAE-Large model. However, the accuracy is still much lower than expected.
I also tried to reproduce the linear-probe results with no success. Interestingly, when I tried the non-normalized loss during pretraining, the linear-probe accuracy for the base config increased to 60% (still much lower than the expected 68%). With the normalized loss I also got 53.9% accuracy, as you did. Were you able to reproduce the linear-probe results lately?
Hi @launchauto, @michuanhaohao, @mts42000, thanks for your efforts in reproducing the linear-probe results. I noticed that the official MAE repo has now released the linear-probe code, so it should not be hard to reproduce. However, I was wondering whether you found what caused the inconsistent performance in your original reproduction? There does not seem to be much difference between your configuration and the official one, yet the performance gap is very large. Any help would be appreciated.
Dear author,
I have reproduced your code using 64 V100 GPUs. Every setting is the same as in the paper (batch size 4096). The end-to-end fine-tuning accuracy is almost the same as reported in the paper, but the linear-probe accuracy is lower than expected. All of the experiments use normalized targets.
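For reference, "normalized targets" means the reconstruction loss is computed against per-patch normalized pixels: each patch is standardized by its own mean and variance before comparing with the decoder output. A minimal NumPy sketch of that target transform (function name and `eps` value are illustrative, not from the original code):

```python
import numpy as np

def normalize_patch_targets(patches, eps=1e-6):
    """Per-patch pixel normalization used as the MAE reconstruction target.

    patches: array of shape (num_patches, patch_dim) holding raw pixel values.
    Each patch is shifted to zero mean and scaled to unit variance.
    """
    mean = patches.mean(axis=-1, keepdims=True)
    var = patches.var(axis=-1, keepdims=True)
    return (patches - mean) / np.sqrt(var + eps)
```

The MSE loss on masked patches is then taken against these normalized values instead of the raw pixels.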
By the way, I used the MoCo V3 2D sin-cos position embedding to replace the 1D sin-cos position embedding, which may help (MAE ViT-Base: +0.3% in both end-to-end fine-tuning and linear probing). MoCo V3
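The 2D variant encodes the row and column of each patch separately, each with half of the embedding channels, instead of a single flattened index. A minimal sketch of such a fixed 2D sin-cos embedding (function names are illustrative; the 10000 frequency base matches the standard Transformer formulation):

```python
import numpy as np

def sincos_1d(embed_dim, positions):
    """Standard 1D sin-cos embedding: (len(positions), embed_dim), embed_dim even."""
    omega = 1.0 / 10000 ** (np.arange(embed_dim // 2) / (embed_dim / 2.0))
    angles = np.outer(positions, omega)                     # (M, embed_dim/2)
    return np.concatenate([np.sin(angles), np.cos(angles)], axis=1)

def sincos_2d(embed_dim, grid_size):
    """2D embedding for a grid_size x grid_size patch grid.

    Half of the channels encode the row index, half the column index.
    Returns an array of shape (grid_size**2, embed_dim).
    """
    coords = np.arange(grid_size, dtype=float)
    rows, cols = np.meshgrid(coords, coords, indexing="ij")
    emb_rows = sincos_1d(embed_dim // 2, rows.reshape(-1))
    emb_cols = sincos_1d(embed_dim // 2, cols.reshape(-1))
    return np.concatenate([emb_rows, emb_cols], axis=1)
```

For ViT-Base on 224x224 images with 16x16 patches, this yields a `(196, 768)` table that is added to the patch tokens and kept frozen.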
I also tested your released 400-epoch MAE-ViT-Base model; its linear-probing top-1 accuracy is 50.91.
Did I miss any details mentioned in the paper?
For the linear-probing hyperparameters, I followed the settings in the appendix of the paper:
optimizer LARS, lr=6.4, batch size=16384, weight_decay=0, momentum=0.9, cosine decay
warmup epochs=10, total training epochs=90, only random resized crop as data augmentation
Replaced the last layer norm with batch norm (affine=False) before the classifier.
During linear probing, I froze the backbone and only updated the fc + norm + mean pooling in the classifier head.