Hi,

Thank you for sharing your research work. I am trying to reproduce the classification results of efficientformer_l1. I followed the instructions in your repo by running the following command:
sh dist_train.sh efficientformer_l1 8
However, after training for 300 epochs, the reproduced accuracy on the 50,000 test images is 78.0%, while the paper reports 79.2%. Could you please comment on that and let me know the reason?
I hope to hear from you soon,
Abdelrahman.
In most of our experiments, we train with 16 GPUs at a per-GPU batch size of 128 to 256. We do notice that it is not easy to achieve the best results when the total batch size is small (say 8×128), so there are two possible solutions:

1. In our latest EfficientFormerV2 release, we added synchronized normalization, which may help achieve more stable training results with a small batch size.
2. Use a larger batch size if there is sufficient VRAM.
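For anyone else hitting this, synchronized normalization in PyTorch can be enabled by converting a model's BatchNorm layers before wrapping it in DDP. This is a minimal sketch, not the repo's actual training code; the toy model below is a hypothetical stand-in for EfficientFormer, and the same one-line conversion applies to any model containing BatchNorm layers:

```python
import torch.nn as nn

# Hypothetical small model standing in for EfficientFormer; any model
# with BatchNorm layers is converted the same way.
model = nn.Sequential(
    nn.Conv2d(3, 16, 3, padding=1),
    nn.BatchNorm2d(16),
    nn.ReLU(),
)

# Replace every BatchNorm layer with SyncBatchNorm so that, under
# distributed training, batch statistics are aggregated across all GPUs
# instead of being computed per-GPU. The conversion itself needs no
# process group; the actual synchronization happens inside DDP forward.
model = nn.SyncBatchNorm.convert_sync_batchnorm(model)

# The BatchNorm2d layer has been swapped for SyncBatchNorm.
print(type(model[1]).__name__)
```

In a real run you would then wrap the converted model in `torch.nn.parallel.DistributedDataParallel`; converting before wrapping is required so DDP broadcasts the replaced modules.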
@liyy201912 Thank you for your reply and the clarification; I will check that.
I have one more question: can you provide more details about the Fast Latency-Driven Slimming algorithm? How many GPUs do you need to train it, and what is its running time? Is this algorithm open source, or do you plan to release it soon?