The motivation for AMSGrad lies in the observation that Adam fails to converge to an optimal solution on some datasets and is outperformed by SGD with momentum.
Reddi et al. (2018) [1] show that one cause of this behavior is the exponential moving average of past squared gradients: its short-term memory means that large, informative gradients are quickly forgotten, which can let the effective per-parameter step size grow again.
To fix this, the authors propose a new algorithm called AMSGrad, which normalizes the update with the running maximum of the exponentially averaged squared gradients rather than with the current average itself.
For simplicity, the authors also remove the debiasing step seen in Adam, which leads to the following update rule:
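$$
\begin{aligned}
m_t &= \beta_1 m_{t-1} + (1 - \beta_1)\, g_t \\
v_t &= \beta_2 v_{t-1} + (1 - \beta_2)\, g_t^2 \\
\hat{v}_t &= \max(\hat{v}_{t-1}, v_t) \\
\theta_{t+1} &= \theta_t - \frac{\eta}{\sqrt{\hat{v}_t} + \epsilon}\, m_t
\end{aligned}
$$

Here $g_t$ is the gradient at step $t$, $\eta$ is the learning rate, and $\beta_1$, $\beta_2$, $\epsilon$ are the usual Adam hyperparameters. The only change relative to Adam is the $\max$ in the third line: because $\hat{v}_t$ never decreases, the per-parameter effective step size never increases between steps.

To make the rule concrete, here is a minimal NumPy sketch of a single AMSGrad step without debiasing; the function name, default hyperparameter values, and the way the state is passed around are illustrative choices, not the paper's reference implementation.

```python
import numpy as np

def amsgrad_step(theta, grad, m, v, v_hat,
                 lr=0.001, beta1=0.9, beta2=0.999, eps=1e-8):
    """One AMSGrad update (no bias correction). All arrays share theta's shape."""
    m = beta1 * m + (1 - beta1) * grad          # EMA of gradients (as in Adam)
    v = beta2 * v + (1 - beta2) * grad ** 2     # EMA of squared gradients (as in Adam)
    v_hat = np.maximum(v_hat, v)                # running maximum of v_t -- the AMSGrad change
    theta = theta - lr * m / (np.sqrt(v_hat) + eps)
    return theta, m, v, v_hat
```

Since the denominator $\sqrt{\hat{v}_t} + \epsilon$ can only grow, the step sizes are monotonically non-increasing per coordinate, which is the property Reddi et al. use to establish convergence in the settings where Adam can fail.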
For more information, check out the paper 'On the Convergence of Adam and Beyond' and the AMSGrad section of the 'An overview of gradient descent optimization algorithms' article.
[1] Reddi, Sashank J., Kale, Satyen, & Kumar, Sanjiv. (2018). [On the Convergence of Adam and Beyond](https://arxiv.org/abs/1904.09237v1).