Training from scratch #1
Comments
Yes, I've trained ResNet-50 from scratch, and it does not reach MSRA's accuracy because of a few differences. This is noted in the README. If you figure out changes that reproduce MSRA's accuracy, please let me know and we can fix it :)
@antingshen Hi, I have used this code to train ResNet-18 from scratch, and it did not achieve a good result either. I found that the training accuracy is higher than the test accuracy after about 6-10 epochs, which seems abnormal. I also trained one model without BN layers, which achieves 65% top-1 accuracy, better than the models using BN layers. I changed your code to train models on CIFAR-10 and achieved 88.76% top-1 accuracy vs. 90.0 in He's paper, which seems correct. So I am very confused. Could you tell me what your training process is and what accuracy you get?
Maybe. I think we need a bit more detail or experimentation to find out the exact BN implementation. I'm happy to cooperate. Let me know if you have any ideas.
Could you use the modified data_reader.cpp from https://github.com/lim0606/caffe-googlenet-bn to shuffle data during training? I found this improves the accuracy for GoogLeNet. I wonder whether it could improve ResNet too.
The link is broken, but I think we want shuffling + random resize + random crop, all on the fly during training. Or at least it seems like it from the MSRA paper. I'd say modifying I have WeChat & Messenger.
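For anyone who wants to try this outside Caffe first, here is a rough NumPy sketch of per-epoch shuffling plus random resize and random crop done on the fly. It is not the code from this repo or from MSRA. The function names `random_resize_crop` and `shuffled_batches` are made up for illustration, the 256-480 shorter-side range and the 224 crop are my reading of the MSRA paper, and the nearest-neighbour resize is only there to keep the example dependency-free.

```python
import numpy as np

def random_resize_crop(image, rng, crop=224, min_side=256, max_side=480):
    """Resize an HxWxC image so its shorter side is a random length in
    [min_side, max_side] (assumed range), then take a random crop x crop patch."""
    h, w = image.shape[:2]
    target = int(rng.integers(min_side, max_side + 1))
    scale = target / min(h, w)
    new_h = max(crop, int(round(h * scale)))
    new_w = max(crop, int(round(w * scale)))
    # Nearest-neighbour resize by index lookup (a real pipeline would
    # normally use bilinear or bicubic interpolation instead).
    rows = np.minimum((np.arange(new_h) * h / new_h).astype(int), h - 1)
    cols = np.minimum((np.arange(new_w) * w / new_w).astype(int), w - 1)
    resized = image[rows][:, cols]
    # Random crop position inside the resized image.
    top = int(rng.integers(0, new_h - crop + 1))
    left = int(rng.integers(0, new_w - crop + 1))
    return resized[top:top + crop, left:left + crop]

def shuffled_batches(images, labels, batch_size, rng):
    """Yield (batch, labels) pairs in a fresh random order; call once per epoch."""
    order = rng.permutation(len(images))
    for start in range(0, len(order), batch_size):
        idx = order[start:start + batch_size]
        batch = np.stack([random_resize_crop(images[i], rng) for i in idx])
        yield batch, np.asarray([labels[i] for i in idx])
```

Calling `shuffled_batches(images, labels, 256, np.random.default_rng())` at the start of every epoch gives a new sample order and new crops each time, which is the behaviour described above. Whether this closes the accuracy gap is untested.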
Could somebody share a ResNet-18 model pre-trained on ImageNet?
Have you ever trained a model using your code? I tried to train a new model, but did not achieve the accuracy.