Unstable Deep Learning model #15515
hasithjp asked this question in Technical Notes
Description
If your Deep Learning model aborts with the following message, then there was a numerical instability:
The reason:
Certain activation functions (Rectifier/RectifierWithDropout/Maxout/MaxoutWithDropout) are unbounded and can lead to a numerical explosion (a cascade of multiplications by large numbers).
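To see why unbounded activations can blow up while bounded ones cannot, here is a minimal pure-Python sketch (not H2O code): a scalar signal repeatedly multiplied by a weight slightly above 1 grows without limit under a ReLU-style activation, but saturates under tanh, which is bounded in (-1, 1). The `forward` helper and its parameters are illustrative, not part of any H2O API.

```python
import math

def forward(x, weight, activation, layers=60):
    """Propagate a scalar through `layers` layers, each applying
    activation(weight * x). A toy stand-in for a deep network."""
    for _ in range(layers):
        x = activation(weight * x)
    return x

relu = lambda v: max(0.0, v)  # unbounded: output can grow without limit
tanh = math.tanh              # bounded in (-1, 1)

# With weight 1.5 and 60 layers, the unbounded path explodes
# (roughly 1.5**60), while the tanh path stays below 1 in magnitude.
exploded = forward(1.0, 1.5, relu)
bounded = forward(1.0, 1.5, tanh)
```

This is exactly the "cascade of multiplications" failure mode: each layer amplifies the previous one, and with no bound on the activation the values eventually overflow.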
The solution:
"Try a different initial distribution, a bounded activation function or adding regularization with L1, L2 or max_w2 and/or use a smaller learning rate or faster annealing"
In my experience, it is almost always a good idea to add max_w2 = 10 and l1=1e-5 to the model parameters.
If that doesn't help, double check your settings for the learning rate or use adaptive learning rate.
If that doesn't help, use the "Tanh" or "TanhWithDropout" activation function.
If that doesn't help, check the "initial_weight_distribution" (try "UniformAdaptive") or reduce the "initial_weight_scale" to << 1.
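The remediation steps above can be collected into one set of model parameters. The sketch below is a plain Python dict using the H2O parameter names mentioned in this note (max_w2, l1, adaptive_rate, activation, initial_weight_distribution); if you use the h2o Python package, settings like these are typically passed as keyword arguments to the Deep Learning estimator. The specific values follow the recommendations above; treat this as an illustrative starting point, not a definitive configuration.

```python
# Stabilizing settings recommended in this note, gathered in one place.
# Apply them incrementally: start with max_w2 and l1, then adaptive_rate,
# then a bounded activation, then the initial weight distribution.
stabilizing_params = {
    "max_w2": 10.0,   # cap on the squared sum of incoming weights per unit
    "l1": 1e-5,       # L1 regularization, keeps weights small and sparse
    "adaptive_rate": True,            # adaptive learning rate instead of a
                                      # hand-tuned rate/annealing schedule
    "activation": "Tanh",             # bounded activation if instability persists
    "initial_weight_distribution": "UniformAdaptive",  # safer initialization
}
```

Only the first two settings (max_w2 = 10, l1 = 1e-5) are needed in most cases; add the others one at a time if the instability persists.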
If that doesn't help, please contact us at [email protected]. If you encounter this problem with default arguments, please also send us your dataset: [email protected]
JIRA Issue Migration Info
Jira Issue: TN-4
Assignee: Arno Candel
Reporter: Arno Candel
State: Closed
Relates to: #14962