5 Mar 2020 · Alexandre Défossez, Léon Bottou, Francis Bach, Nicolas Usunier

We provide a simple proof of the convergence of the optimization algorithms Adam and Adagrad under the assumptions of smooth gradients and an almost sure uniform bound on the $\ell_\infty$ norm of the gradients. This work builds on the techniques introduced by Ward et al. (2019) and extends them to the Adam optimizer.
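For context, the Adam update rule (Kingma & Ba, 2015) that the analysis covers can be sketched as follows; this is a minimal NumPy illustration, not the authors' code, and the hyperparameter defaults are the conventional ones rather than values taken from the paper:

```python
import numpy as np

def adam_step(theta, grad, m, v, t,
              alpha=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step; t is the 1-based iteration counter."""
    # Exponential moving averages of the gradient and its elementwise square
    m = beta1 * m + (1 - beta1) * grad
    v = beta2 * v + (1 - beta2) * grad ** 2
    # Bias correction for the zero initialization of m and v
    m_hat = m / (1 - beta1 ** t)
    v_hat = v / (1 - beta2 ** t)
    # Parameter update with a per-coordinate adaptive step size
    theta = theta - alpha * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v
```

With `beta1 = 0` and `beta2 -> 1`, the averaging of squared gradients approaches the cumulative sum used by Adagrad, which is how the two algorithms admit a common analysis.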