Towards Deeper Understanding of Nonconvex Stochastic Optimization with Momentum using Diffusion Approximations

14 Feb 2018 · Tianyi Liu, Zhehui Chen, Enlu Zhou, Tuo Zhao

The Momentum Stochastic Gradient Descent (MSGD) algorithm has been widely applied to many nonconvex optimization problems in machine learning, e.g., training deep neural networks and variational Bayesian inference. Due to current technical limitations, however, establishing convergence properties of MSGD for these highly complicated nonconvex problems is generally infeasible...
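The abstract is truncated, but the MSGD update it refers to is the standard heavy-ball momentum scheme: accumulate a velocity from past gradients, then step along it. A minimal NumPy sketch on a toy noisy quadratic (the objective, step size, momentum value, and noise level are illustrative assumptions, not taken from the paper):

```python
import numpy as np

def msgd(grad, x0, lr=0.01, momentum=0.9, n_steps=500):
    """Heavy-ball MSGD: v <- mu*v - lr*g(x); x <- x + v."""
    x = np.asarray(x0, dtype=float)
    v = np.zeros_like(x)
    for _ in range(n_steps):
        g = grad(x)            # stochastic gradient estimate
        v = momentum * v - lr * g
        x = x + v
    return x

# Illustrative problem: minimize f(x) = (x - 3)^2 with noisy gradients.
rng = np.random.default_rng(0)
def noisy_grad(x):
    return 2.0 * (x - 3.0) + rng.normal(scale=0.1, size=x.shape)

x_final = msgd(noisy_grad, x0=[0.0])
```

With these (assumed) hyperparameters the iterates settle near the minimizer x = 3; the momentum term smooths the gradient noise but also keeps the iterates oscillating slightly around the optimum, which is the kind of long-run stochastic behavior diffusion approximations are designed to characterize.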


Methods used in the Paper


Method: PCA (type: Dimensionality Reduction)