A First-Order Method for Estimating Natural Gradients for Variational Inference with Gaussians and Gaussian Mixture Models

29 Sep 2021  ·  Oleg Arenz, Zihan Ye, Philipp Dahlinger, Gerhard Neumann

Variational inference with full-covariance Gaussian approximations is an important line of research, as such Gaussian variational approximations (GVAs) allow for tractable approximate inference while yielding superior approximations compared to mean-field methods. Moreover, it was recently shown that the problem of variational inference with Gaussian mixture models can be reduced to Gaussian variational inference using VIPS, a procedure similar to expectation maximization. Effective approaches for Gaussian variational inference are MORE, VOGN, and VON, which are zero-order, first-order, and second-order methods, respectively. We focus on the first-order setting, which is arguably the most relevant for variational inference, and show that the biases introduced by the generalized Gauß-Newton approximation, which is applied by VOGN, can seriously affect the quality of the learned approximation. Hence, we propose gMORE, a method that is similar to MORE but differs by incorporating gradient information. gMORE achieves unbiased, high-quality approximations of the Hessian that are similar to those of VON, which has direct access to the Hessian. Our algorithm converges even in settings where VOGN does not. Compared to MORE, the additional gradient information improves sample efficiency by about an order of magnitude. Furthermore, we evaluate the different approaches in the GMM setting by modifying VIPS, which had previously only been tested in combination with MORE, and show that the results from the GVA setting transfer to GMMs, setting a new standard for GMM-based variational inference.
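To illustrate the first-order idea the abstract describes — estimating Hessian information for a Gaussian variational approximation from gradients of the target alone — here is a minimal sketch. It is not the paper's gMORE algorithm; it uses Stein's lemma to estimate E_q[∇² log p] from gradient samples and plugs that into VON-style natural-gradient updates on the mean and precision. The quadratic target, step size, and sample count are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative target: log p(x) = -0.5 (x - m)^T P (x - m),
# i.e. a Gaussian with mean m and precision P (not from the paper).
m = np.array([1.0, -2.0])
P = np.array([[2.0, 0.3], [0.3, 1.0]])

def grad_log_p(x):
    return -P @ (x - m)

# Gaussian variational approximation q = N(mu, Sigma), initialized broadly.
mu = np.zeros(2)
prec = np.eye(2)            # precision matrix of q
n_samples, lr = 512, 0.2

for _ in range(200):
    Sigma = np.linalg.inv(prec)
    xs = rng.multivariate_normal(mu, Sigma, size=n_samples)
    gs = np.array([grad_log_p(x) for x in xs])

    # First-order Hessian estimate via Stein's lemma:
    #   E_q[∇² log p(x)] = E_q[Sigma^{-1} (x - mu) ∇log p(x)^T]
    d = xs - mu
    H_est = prec @ (d.T @ gs) / n_samples
    H_est = 0.5 * (H_est + H_est.T)   # symmetrize the Monte Carlo estimate

    # VON-style natural-gradient updates on precision and mean.
    prec = (1 - lr) * prec + lr * (-H_est)
    mu = mu + lr * np.linalg.solve(prec, gs.mean(axis=0))

print(mu)    # converges toward m
print(prec)  # converges toward P
```

For this quadratic target the exact Hessian is -P everywhere, so the estimated precision should approach P and the mean should approach m; the point of the sketch is that only `grad_log_p` is ever queried, never the Hessian itself.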
