On a Conjecture Regarding the Adam Optimizer
Why does the Adam optimizer work so well in deep-learning applications? Adam's originators, Kingma and Ba, presented a mathematical argument that was meant to help explain its success, but Bock and colleagues have since reported that a key piece is missing from that argument $-$ an unproven lemma which we will call Bock's conjecture. Here we show that this conjecture is false, but we prove a modified version of it $-$ a generalization of a result of Reddi and colleagues $-$ which can take its place in analyses of Adam.
PDF AbstractTasks
Datasets
Add Datasets
introduced or used in this paper
Results from the Paper
Submit
results from this paper
to get state-of-the-art GitHub badges and help the
community compare results to other papers.