Search Results for author: Wu Lin

Found 13 papers, 8 papers with code

Can We Remove the Square-Root in Adaptive Gradient Methods? A Second-Order Perspective

no code implementations • 5 Feb 2024 • Wu Lin, Felix Dangel, Runa Eschenhagen, Juhan Bae, Richard E. Turner, Alireza Makhzani

Adaptive gradient optimizers like Adam(W) are the default training algorithms for many deep learning architectures, such as transformers.

Second-order methods
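The title's question is easiest to see on a toy diagonal update. Below is a minimal sketch (mine, not the paper's algorithm) contrasting the familiar Adam-style step, which divides by the square root of the second-moment estimate, with a root-free variant in which that estimate acts as a diagonal curvature preconditioner; the function name and default constants are illustrative assumptions.

```python
import numpy as np

def adaptive_step(g, m, v, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8,
                  root_free=False):
    """One diagonal adaptive-gradient step on gradient g (a sketch).

    root_free=False: the usual Adam-style step  -lr * m / (sqrt(v) + eps).
    root_free=True:  the square root is dropped, so v behaves like a
    diagonal second-order (curvature) preconditioner rather than a
    gradient-magnitude normalizer.
    """
    m = beta1 * m + (1 - beta1) * g       # first-moment (momentum) estimate
    v = beta2 * v + (1 - beta2) * g ** 2  # second-moment estimate
    denom = (v + eps) if root_free else (np.sqrt(v) + eps)
    return -lr * m / denom, m, v
```

Dropping the root changes the units and scale of the step, so the two variants would not share a learning rate in practice.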

Simplifying Momentum-based Positive-definite Submanifold Optimization with Applications to Deep Learning

1 code implementation • 20 Feb 2023 • Wu Lin, Valentin Duruisseaux, Melvin Leok, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

Riemannian submanifold optimization with momentum is computationally challenging because, to ensure that the iterates remain on the submanifold, we often need to solve difficult differential equations.
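For the positive-definite case, one standard way to avoid those differential equations is a retraction: a cheap map that stays on the manifold and matches the geodesic to first order. The sketch below is a generic illustration of that idea under my own naming, not the paper's update; notably it ignores how the momentum buffer should be transported between tangent spaces, which is part of what the paper simplifies.

```python
import numpy as np

def spd_retraction(X, xi):
    """Second-order retraction on the manifold of symmetric positive-
    definite matrices. Since X + xi + 0.5 * xi X^{-1} xi equals
    (X + (X^{1/2} + xi X^{-1/2})(X^{1/2} + xi X^{-1/2})^T) / 2,
    the result is positive definite for any symmetric direction xi,
    with no matrix exponential or geodesic ODE required."""
    return X + xi + 0.5 * xi @ np.linalg.solve(X, xi)

def momentum_step(X, grad, M, lr=0.01, beta=0.9):
    """Naive momentum on the SPD manifold: accumulate a symmetric
    momentum buffer M, then retract the step back onto the manifold."""
    M = beta * M + grad
    return spd_retraction(X, -lr * M), M
```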

Structured second-order methods via natural gradient descent

no code implementations • 22 Jul 2021 • Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

In this paper, we propose new structured second-order methods and structured adaptive-gradient methods obtained by performing natural-gradient descent on structured parameter spaces.

Second-order methods
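For reference, the vanilla natural-gradient step that these methods build on, with F the Fisher information matrix of the distribution q_θ, is

```latex
\theta_{t+1} = \theta_t - \beta\, F(\theta_t)^{-1}\, \nabla_{\theta} \mathcal{L}(\theta_t),
\qquad
F(\theta) = \mathbb{E}_{q_\theta}\!\left[ \nabla_{\theta} \log q_\theta(z)\, \nabla_{\theta} \log q_\theta(z)^{\top} \right].
```

The structured variants the abstract refers to restrict θ (and hence F) to sparse or low-rank forms so that the inverse stays tractable; the notation here is generic rather than the paper's.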

Tractable structured natural gradient descent using local parameterizations

no code implementations • 15 Feb 2021 • Wu Lin, Frank Nielsen, Mohammad Emtiyaz Khan, Mark Schmidt

Natural-gradient descent (NGD) on structured parameter spaces (e.g., low-rank covariances) is computationally challenging due to difficult Fisher-matrix computations.

Variational Inference
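Schematically (in my notation, not necessarily the paper's), a local parameterization re-anchors the geometry at every iterate: choose a map φ_{θ_t} with φ_{θ_t}(0) = θ_t such that the Fisher matrix at the origin of the local coordinates η is cheap (ideally the identity), take the natural-gradient step there, and map back:

```latex
\eta_{\star} = -\,\beta\, F_{\eta}(0)^{-1}\, \nabla_{\eta}\, \mathcal{L}\big(\phi_{\theta_t}(\eta)\big)\Big|_{\eta=0},
\qquad
\theta_{t+1} = \phi_{\theta_t}(\eta_{\star}).
```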

Handling the Positive-Definite Constraint in the Bayesian Learning Rule

1 code implementation • ICML 2020 • Wu Lin, Mark Schmidt, Mohammad Emtiyaz Khan

The Bayesian learning rule is a natural-gradient variational inference method, which not only contains many existing learning algorithms as special cases but also enables the design of new algorithms.

Variational Inference

Stein's Lemma for the Reparameterization Trick with Exponential Family Mixtures

1 code implementation • 29 Oct 2019 • Wu Lin, Mohammad Emtiyaz Khan, Mark Schmidt

Our generalization enables us to establish a connection between Stein's lemma and the reparameterization trick to derive gradients of expectations of a large class of functions under weak assumptions.

LEMMA
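The Gaussian special case being generalized is classical: for q = N(μ, Σ), Bonnet's and Price's theorems (both instances of Stein's lemma) give

```latex
\nabla_{\mu}\, \mathbb{E}_{q}\big[ f(z) \big] = \mathbb{E}_{q}\big[ \nabla_{z} f(z) \big],
\qquad
\nabla_{\Sigma}\, \mathbb{E}_{q}\big[ f(z) \big] = \tfrac{1}{2}\, \mathbb{E}_{q}\big[ \nabla_{z}^{2} f(z) \big].
```

The paper extends identities of this form beyond the Gaussian to exponential-family mixtures.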

Fast and Simple Natural-Gradient Variational Inference with Mixture of Exponential-family Approximations

1 code implementation • 7 Jun 2019 • Wu Lin, Mohammad Emtiyaz Khan, Mark Schmidt

Natural-gradient methods enable fast and simple algorithms for variational inference, but due to computational difficulties, their use is mostly limited to minimal exponential-family (EF) approximations.

Bayesian Inference, Variational Inference
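The reason minimal exponential families are the easy case is a standard identity: for q_λ with natural parameter λ, log-partition A, and mean parameter m = E_{q_λ}[T(z)], the Fisher matrix is F(λ) = ∇²A(λ) = ∂m/∂λ, so the natural gradient collapses to an ordinary gradient in the mean parameterization:

```latex
F(\lambda)^{-1}\, \nabla_{\lambda} \mathcal{L} = \nabla_{m} \mathcal{L}.
```

No explicit Fisher inversion is needed; it is this shortcut that becomes delicate for the mixture approximations the paper targets.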

Variational Message Passing with Structured Inference Networks

1 code implementation • ICLR 2018 • Wu Lin, Nicolas Hubacher, Mohammad Emtiyaz Khan

Recent efforts on combining deep models with probabilistic graphical models are promising in providing flexible models that are also easy to interpret.

Variational Inference

Variational Adaptive-Newton Method for Explorative Learning

no code implementations • 15 Nov 2017 • Mohammad Emtiyaz Khan, Wu Lin, Voot Tangkaratt, Zuozhu Liu, Didrik Nielsen

We present the Variational Adaptive Newton (VAN) method, a black-box optimization method especially suited to explorative-learning tasks such as active learning and reinforcement learning.

Active Learning, reinforcement-learning +2
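A rough sketch of the explorative flavour of such a method, under stated assumptions (this is not the paper's exact update, and the names, constants, and omitted precision update are all illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

def van_like_step(mu, S, loss_grad, lr=0.1, n_samples=8):
    """Maintain a Gaussian N(mu, inv(S)) over the parameters, sample
    from it to explore, average the sampled gradients, and precondition
    the mean update with the precision S (a Newton-like step).
    VAN additionally adapts S from curvature information, omitted here."""
    cov = np.linalg.inv(S)                                  # S is the precision
    samples = rng.multivariate_normal(mu, cov, size=n_samples)
    g_hat = np.mean([loss_grad(t) for t in samples], axis=0)
    return mu - lr * np.linalg.solve(S, g_hat)
```

Sampling around the mean, rather than differentiating at a point estimate, is what makes the scheme explorative.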

Conjugate-Computation Variational Inference: Converting Variational Inference in Non-Conjugate Models to Inferences in Conjugate Models

2 code implementations • 13 Mar 2017 • Mohammad Emtiyaz Khan, Wu Lin

In this paper, we propose a new algorithm called Conjugate-Computation Variational Inference (CVI), which brings the best of both worlds together: it uses conjugate computations for the conjugate terms and employs stochastic gradients for the rest.

Variational Inference
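The engine behind updates like this is the classical correspondence between natural-gradient ascent in the natural parameters λ of an exponential-family approximation and mirror descent in its mean parameters m: when maximizing the ELBO L,

```latex
\lambda_{t+1} = \lambda_t + \beta_t\, \nabla_{m}\, \mathcal{L}(m_t).
```

For the conjugate terms of the model, ∇_m L is available in closed form and the step reduces to a standard conjugate Bayesian update; for the remaining terms it is estimated stochastically, which is exactly the split the abstract describes.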
