Fine-tune with provided solver leads to loss=87.3365 #75
Comments
Reduce the lr to 0.001 or 0.0001, then give it a try.
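A minimal sketch of the kind of solver change being suggested, assuming a Caffe solver.prototxt like the one referenced in the issue; the net path and schedule values below are placeholders, only base_lr reflects the suggestion:
```
# Illustrative solver.prototxt; only base_lr reflects the suggestion above.
# The other values and the net path are placeholders, not the repo's actual settings.
net: "mobilenet_train_val.prototxt"   # hypothetical path
base_lr: 0.0001                       # lowered from 0.01 as suggested
lr_policy: "step"
gamma: 0.1
stepsize: 100000
momentum: 0.9
weight_decay: 0.00004
solver_mode: GPU
```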
@zhuochen24 How did you solve it?
@shicai @OceanWong1991 @zhuochen24 Hi, I met the same problem. I have tried reducing the lr, but nothing changed. I also find the debug log abnormal; it reads as follows:
I0428 10:34:09.415983 2335 parallel.cpp:391] GPUs pairs 0:1
As you can see, the outputs of conv1 are unusual (nan or inf), so I tried other weight_filler types, and nothing changed.
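For reference, the weight_filler change described above would look roughly like this in the train prototxt; the layer parameters are illustrative (a standard MobileNet v1 first convolution), not copied from the repo:
```
layer {
  name: "conv1"
  type: "Convolution"
  bottom: "data"
  top: "conv1"
  convolution_param {
    num_output: 32
    kernel_size: 3
    stride: 2
    pad: 1
    # Swapping the filler type ("msra", "xavier", "gaussian") is the change
    # the commenter tried; they report it made no difference.
    weight_filler { type: "msra" }
  }
}
```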
@zhuochen24 Hi, have you solved it? I met the same problem.
@feitiandemiaomi @shaqing I solved it by setting batchnorm to false.
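The comment above does not say which setting was flipped; one common reading, offered here only as an assumption, is the use_global_stats flag of Caffe's BatchNorm layers in the training prototxt:
```
layer {
  name: "conv1/bn"     # hypothetical layer name, not copied from the repo
  type: "BatchNorm"
  bottom: "conv1"
  top: "conv1"
  batch_norm_param {
    # Assumption: "setting batchnorm to false" means use_global_stats: false,
    # i.e. compute batch statistics during training instead of reusing the
    # stored moving averages from the deploy/pretrained prototxt.
    use_global_stats: false
  }
}
```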
I met the same problem. |
Original issue description:
I am using the solver provided in #9, with the base learning rate reduced to 0.01, to fine-tune MobileNet v1 on ImageNet, and the loss goes up to 87.3365 within the first 20 iterations. I am wondering whether you have any intuition about why this happens?
Thanks!
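For context, the specific value 87.3365 is a signature of divergence in Caffe rather than a meaningful loss: SoftmaxWithLoss clips the predicted probability at FLT_MIN before taking the log, so once the probability of the correct class collapses to zero (typically after NaN/Inf activations, like the conv1 outputs reported above), the per-sample loss saturates at -ln(1.17549435e-38) ≈ 87.3365, regardless of the model or data.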