The Marginal Value of Adaptive Gradient Methods in Machine Learning

NeurIPS 2017 · Ashia C. Wilson, Rebecca Roelofs, Mitchell Stern, Nathan Srebro, Benjamin Recht

Adaptive optimization methods, which perform local optimization with a metric constructed from the history of iterates, are becoming increasingly popular for training deep neural networks. Examples include AdaGrad, RMSProp, and Adam...
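To illustrate the family of methods the abstract describes, here is a minimal sketch of one such adaptive update (Adam), in which per-coordinate step sizes are derived from running moment estimates of past gradients. This is an illustrative NumPy implementation on a toy quadratic, not code from the paper; all function and variable names are my own.

```python
import numpy as np

def adam_step(w, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One Adam update: step sizes adapt per coordinate to gradient history."""
    m = b1 * m + (1 - b1) * grad       # first-moment (mean) estimate
    v = b2 * v + (1 - b2) * grad**2    # second-moment (uncentered variance) estimate
    m_hat = m / (1 - b1**t)            # bias correction for zero initialization
    v_hat = v / (1 - b2**t)
    w = w - lr * m_hat / (np.sqrt(v_hat) + eps)
    return w, m, v

# Toy problem: minimize f(w) = ||w||^2, whose gradient is 2w.
w = np.array([1.0, -2.0])
m = np.zeros_like(w)
v = np.zeros_like(w)
for t in range(1, 2001):
    w, m, v = adam_step(w, 2 * w, m, v, t, lr=0.05)
```

The division by `sqrt(v_hat)` is the "metric constructed from the history of iterates": coordinates with consistently large gradients get smaller effective steps, which is precisely the behavior whose generalization consequences the paper examines.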


