Adaptively Preconditioned Stochastic Gradient Langevin Dynamics

10 Jun 2019Chandrasekaran Anirudh Bhardwaj

Stochastic Gradient Langevin Dynamics infuses isotropic gradient noise to SGD to help navigate pathological curvature in the loss landscape for deep networks. Isotropic nature of the noise leads to poor scaling, and adaptive methods based on higher order curvature information such as Fisher Scoring have been proposed to precondition the noise in order to achieve better convergence... (read more)

PDF Abstract

Results from the Paper

  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods used in the Paper