SGD+Nesterov optimizer #13

albarji · 2018-03-03T12:12:39Z

Recent results show that while adaptive optimization methods do reach better minima of the training loss function, they are more prone to overfitting, especially in networks with more parameters than training examples, which might well be the case here.

To avoid this, it would be useful to add SGD and SGD+Nesterov as optimizers in the hyperparameter tuning procedure.
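As a rough illustration of what the two proposed options would do, here is a minimal pure-Python sketch of one common formulation of the SGD update with optional Nesterov momentum. The names `sgd_step`, `OPTIMIZER_CHOICES` and `minimize_quadratic` are hypothetical and are not part of this project's code; the learning rate and momentum values are placeholders, not tuned settings.

```python
def sgd_step(params, grads, velocity, lr=0.01, momentum=0.9, nesterov=False):
    """One SGD update over lists of floats (hypothetical helper).

    Classical momentum: v <- momentum * v - lr * g;  p <- p + v
    Nesterov variant:   v <- momentum * v - lr * g;  p <- p + momentum * v - lr * g
    """
    new_params, new_velocity = [], []
    for p, g, v in zip(params, grads, velocity):
        v_next = momentum * v - lr * g
        if nesterov:
            # Look-ahead step: apply the updated velocity once more
            p_next = p + momentum * v_next - lr * g
        else:
            p_next = p + v_next
        new_params.append(p_next)
        new_velocity.append(v_next)
    return new_params, new_velocity


# Hypothetical entries that a tuning search space could sample from
OPTIMIZER_CHOICES = {
    "sgd": {"momentum": 0.0, "nesterov": False},
    "sgd_nesterov": {"momentum": 0.9, "nesterov": True},
}


def minimize_quadratic(name, steps=200, lr=0.05):
    """Minimize f(x) = x^2 from x = 5 with the named optimizer setting."""
    cfg = OPTIMIZER_CHOICES[name]
    params, velocity = [5.0], [0.0]
    for _ in range(steps):
        grads = [2.0 * p for p in params]  # gradient of x^2 is 2x
        params, velocity = sgd_step(params, grads, velocity, lr=lr, **cfg)
    return params[0]
```

In an actual tuning run the optimizer name would presumably be one more categorical hyperparameter next to the existing adaptive methods, with the learning rate (and possibly the momentum) searched per optimizer.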
