Recent results show that while adaptive optimization methods do reach better minima of the training loss, they are more prone to overfitting, especially in networks with more parameters than training examples, which may well be the case here.
To mitigate this, it would be useful to include SGD and SGD+Nesterov as optimizer options in the hyperparameter tuning procedure.
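For reference, a minimal sketch of how the optimizer choice could be exposed as a tunable option, assuming a Keras/TensorFlow setup; the `build_optimizer` helper, the option names, and the default learning rate / momentum values are hypothetical and just illustrate the idea:

```python
from tensorflow import keras

def build_optimizer(name: str, learning_rate: float = 0.01):
    """Map a tunable option name to a Keras optimizer instance.

    Note: this helper and its defaults are a sketch, not part of the
    current codebase.
    """
    if name == "sgd":
        # Plain SGD: no adaptive per-parameter learning rates
        return keras.optimizers.SGD(learning_rate=learning_rate)
    if name == "sgd_nesterov":
        # SGD with Nesterov momentum (lookahead gradient step)
        return keras.optimizers.SGD(
            learning_rate=learning_rate, momentum=0.9, nesterov=True
        )
    if name == "adam":
        # Adaptive baseline to compare against
        return keras.optimizers.Adam(learning_rate=learning_rate)
    raise ValueError(f"Unknown optimizer: {name}")

# The tuner would then sample from e.g.
# ["adam", "sgd", "sgd_nesterov"] and call build_optimizer(choice).
```

The point is only that the search space should cover the non-adaptive optimizers alongside the adaptive ones, so the tuner itself can reveal whether the adaptive methods are overfitting here.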