NeurIPS Workshop DL-IG 2020 • Juntang Zhuang, Tommy Tang, Sekhar Tatikonda, Nicha C Dvornek, Yifan Ding, Xenophon Papademetris, James S Duncan
We propose the AdaBelief optimizer to simultaneously achieve three goals: fast convergence as in adaptive methods, good generalization as in SGD, and training stability.