Unstable Deep Learning model #15515
hasithjp asked this question in Technical Notes
Description
If your Deep Learning model aborts with the following message, then there was a numerical instability:
The reason:
Certain activation functions (Rectifier/RectifierWithDropout/Maxout/MaxoutWithDropout) are unbounded and can lead to a numerical explosion (a cascade of multiplications by large numbers).
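To see why unbounded activations can blow up while bounded ones cannot, here is a minimal pure-Python sketch (not H2O code): a scalar signal repeatedly multiplied by a weight slightly above 1 grows without limit under a ReLU-style activation, but saturates under tanh, which is bounded in (-1, 1). The `forward` helper and its parameters are illustrative, not part of any H2O API.

```python
import math

def forward(x, weight, activation, layers=60):
    """Propagate a scalar through `layers` layers, each applying
    activation(weight * x). A toy stand-in for a deep network."""
    for _ in range(layers):
        x = activation(weight * x)
    return x

relu = lambda v: max(0.0, v)  # unbounded: output can grow without limit
tanh = math.tanh              # bounded in (-1, 1)

# With weight 1.5 and 60 layers, the unbounded path explodes
# (roughly 1.5**60), while the tanh path stays below 1 in magnitude.
exploded = forward(1.0, 1.5, relu)
bounded = forward(1.0, 1.5, tanh)
```

This is exactly the "cascade of multiplications" failure mode: each layer amplifies the previous one, and with no bound on the activation the values eventually overflow.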
The solution:
"Try a different initial distribution, a bounded activation function or adding regularization with L1, L2 or max_w2 and/or use a smaller learning rate or faster annealing"
In my experience, it is almost always a good idea to add max_w2 = 10 and l1=1e-5 to the model parameters.
If that doesn't help, double check your settings for the learning rate or use adaptive learning rate.
If that doesn't help, use the "Tanh" or "TanhWithDropout" activation function.
If that doesn't help, check the "initial_weight_distribution" (try "UniformAdaptive") or reduce the "initial_weight_scale" to << 1.
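The remediation steps above can be collected into one set of model parameters. The sketch below is a plain Python dict using the H2O parameter names mentioned in this note (max_w2, l1, adaptive_rate, activation, initial_weight_distribution); if you use the h2o Python package, settings like these are typically passed as keyword arguments to the Deep Learning estimator. The specific values follow the recommendations above; treat this as an illustrative starting point, not a definitive configuration.

```python
# Stabilizing settings recommended in this note, gathered in one place.
# Apply them incrementally: start with max_w2 and l1, then adaptive_rate,
# then a bounded activation, then the initial weight distribution.
stabilizing_params = {
    "max_w2": 10.0,   # cap on the squared sum of incoming weights per unit
    "l1": 1e-5,       # L1 regularization, keeps weights small and sparse
    "adaptive_rate": True,            # adaptive learning rate instead of a
                                      # hand-tuned rate/annealing schedule
    "activation": "Tanh",             # bounded activation if instability persists
    "initial_weight_distribution": "UniformAdaptive",  # safer initialization
}
```

Only the first two settings (max_w2 = 10, l1 = 1e-5) are needed in most cases; add the others one at a time if the instability persists.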
If that doesn't help, please contact us at [email protected]. If you encounter this problem with default arguments, please also send us your dataset: [email protected]
JIRA Issue Migration Info
Jira Issue: TN-4
Assignee: Arno Candel
Reporter: Arno Candel
State: Closed
Relates to: #14962