Search Results for author: Andriy Mnih

Found 28 papers, 13 papers with code

Compositional Score Modeling for Simulation-based Inference

no code implementations 28 Sep 2022 Tomas Geffner, George Papamakarios, Andriy Mnih

Neural Posterior Estimation methods for simulation-based inference can be ill-suited for dealing with posterior distributions obtained by conditioning on multiple observations, as they tend to require a large number of simulator calls to learn accurate approximations.

Variational Inference

Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts

no code implementations NeurIPS Workshop ICBINB 2021 Wouter Kool, Chris J. Maddison, Andriy Mnih

Training large-scale mixture of experts models efficiently on modern hardware requires assigning datapoints in a batch to different experts, each with a limited capacity.
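
As a rough illustration of the capacity constraint (not the paper's balanced-assignment scheme, which keeps the gradient estimate unbiased), a greedy capacity-limited router might look like this; the scores and capacity below are made up:

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_with_capacity(scores, capacity):
    """Greedily route each datapoint to its best-scoring expert that
    still has room. Unlike balanced assignment, this can starve late,
    low-confidence points of their preferred expert."""
    n_points, n_experts = scores.shape
    load = np.zeros(n_experts, dtype=int)
    assignment = np.full(n_points, -1)
    # visit points in order of routing confidence
    for i in np.argsort(-scores.max(axis=1)):
        for expert in np.argsort(-scores[i]):
            if load[expert] < capacity:
                assignment[i] = expert
                load[expert] += 1
                break
    return assignment, load

scores = rng.normal(size=(8, 2))    # 8 datapoints, 2 experts
assignment, load = assign_with_capacity(scores, capacity=4)
```

With total capacity equal to the batch size, every point gets assigned; with less, some points overflow, which is exactly the tension the paper addresses.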

Coupled Gradient Estimators for Discrete Latent Variables

no code implementations NeurIPS 2021 Zhe Dong, Andriy Mnih, George Tucker

Training models with discrete latent variables is challenging due to the high variance of unbiased gradient estimators.

Generalized Doubly Reparameterized Gradient Estimators

no code implementations 26 Jan 2021 Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

Generalized Doubly-Reparameterized Gradient Estimators

no code implementations AABI Symposium 2021 Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

DisARM: An Antithetic Gradient Estimator for Binary Latent Variables

no code implementations NeurIPS 2020 Zhe Dong, Andriy Mnih, George Tucker

Applying antithetic sampling over the augmenting variables yields a relatively low-variance and unbiased estimator applicable to any model with binary latent variables.
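
DisARM applies antithetic sampling to the uniform variables that augment binary latents; the variance-reduction mechanism itself is easy to see on a toy integrand (the exponential below is an arbitrary monotone choice):

```python
import numpy as np

rng = np.random.default_rng(0)

def f(u):
    return np.exp(u)    # any monotone integrand

u = rng.uniform(size=10_000)

# plain Monte Carlo estimate of E[f(U)], U ~ Uniform(0, 1); true value e - 1
plain = f(u)

# antithetic estimate: pair each u with 1 - u; for monotone f the two
# halves are negatively correlated, which lowers the estimator's variance
antithetic = 0.5 * (f(u) + f(1.0 - u))
```

Both arrays have the same expectation, but the antithetic one has far lower variance per sample.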

The Lipschitz Constant of Self-Attention

no code implementations8 Jun 2020 Hyunjik Kim, George Papamakarios, Andriy Mnih

Lipschitz constants of neural networks have been explored in various contexts in deep learning, such as provable adversarial robustness, estimating Wasserstein distance, stabilising training of GANs, and formulating invertible neural networks.

Adversarial Robustness · Language Modelling
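
Self-attention is not Lipschitz in general (part of what the paper analyzes), but for a plain linear map the L2 Lipschitz constant is simply the largest singular value, which power iteration estimates cheaply; this sketch covers only that linear case:

```python
import numpy as np

rng = np.random.default_rng(0)

def spectral_norm(w, iters=100):
    """Power iteration for the largest singular value of w, which is
    the Lipschitz constant of x -> w @ x under the L2 norm."""
    v = rng.normal(size=w.shape[1])
    for _ in range(iters):
        u = w @ v
        u /= np.linalg.norm(u)
        v = w.T @ u
        v /= np.linalg.norm(v)
    return float(u @ w @ v)

w = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0]])
est = spectral_norm(w)
exact = np.linalg.svd(w, compute_uv=False)[0]
```

Power iteration converges geometrically at the rate set by the ratio of the top two singular values, so a hundred iterations is ample here.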

Q-Learning in enormous action spaces via amortized approximate maximization

no code implementations 22 Jan 2020 Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih

Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions.

Continuous Control · Q-Learning
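
The paper amortizes the maximization with a learned proposal network; as a rough sketch of the idea, with a made-up quadratic Q-function and a proposal assumed to already concentrate near good actions:

```python
import numpy as np

rng = np.random.default_rng(0)

def q_value(actions):
    # stand-in Q-function: peaks at action (0.3, 0.3) (hypothetical)
    return -np.sum((actions - 0.3) ** 2, axis=-1)

def amortized_argmax(proposal_mean, n_samples=64, noise=0.1):
    """Approximate argmax_a Q(s, a) by scoring a small set of candidate
    actions drawn from a proposal distribution, instead of solving the
    continuous maximization exactly."""
    candidates = proposal_mean + noise * rng.normal(size=(n_samples, 2))
    return candidates[np.argmax(q_value(candidates))]

# a proposal already trained to put mass near high-value actions
best = amortized_argmax(proposal_mean=0.3 * np.ones(2))
```

The cost of the maximization is then a single batched Q evaluation over the candidates, independent of the action space's size.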

Sparse Orthogonal Variational Inference for Gaussian Processes

1 code implementation AABI Symposium 2019 Jiaxin Shi, Michalis K. Titsias, Andriy Mnih

We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods.

Gaussian Processes · Multi-class Classification +2

Monte Carlo Gradient Estimation in Machine Learning

2 code implementations 25 Jun 2019 Shakir Mohamed, Mihaela Rosca, Michael Figurnov, Andriy Mnih

This paper is a broad and accessible survey of the methods we have at our disposal for Monte Carlo gradient estimation in machine learning and across the statistical sciences: the problem of computing the gradient of an expectation of a function with respect to parameters defining the distribution that is integrated; the problem of sensitivity analysis.

BIG-bench Machine Learning
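
One of the survey's central objects is the score-function (REINFORCE) estimator; a minimal sketch for a Gaussian, where the answer is known in closed form:

```python
import numpy as np

rng = np.random.default_rng(0)

# gradient of E_{x ~ N(mu, 1)}[x^2] with respect to mu;
# the analytic answer is d/dmu (mu^2 + 1) = 2 * mu
mu = 1.5
x = rng.normal(loc=mu, scale=1.0, size=200_000)

# score-function estimator: f(x) * d/dmu log p(x; mu)
score = x - mu                      # d/dmu log N(x; mu, 1)
grad_est = np.mean(x ** 2 * score)
```

The estimator is unbiased for any f, but its variance is typically much higher than pathwise alternatives, which is the trade-off the survey maps out.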

Attentive Neural Processes

7 code implementations ICLR 2019 Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions.


Resampled Priors for Variational Autoencoders

no code implementations 26 Oct 2018 Matthias Bauer, Andriy Mnih

We propose Learned Accept/Reject Sampling (LARS), a method for constructing richer priors using rejection sampling with a learned acceptance function.
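
In LARS the acceptance function is learned jointly with the model; the rejection-sampling mechanics it builds on look like this (the Gaussian proposal and the acceptance function below are made up, not the learned ones):

```python
import numpy as np

rng = np.random.default_rng(0)

def accept_prob(z):
    # a fixed, hand-picked acceptance function in [0, 1];
    # LARS would parameterize this with a neural network
    return np.exp(-0.5 * (z - 1.0) ** 2)

# propose from a standard normal prior and keep each sample with
# probability accept_prob(z); the kept samples then follow a density
# proportional to N(z; 0, 1) * accept_prob(z), here N(z; 0.5, 0.5)
z = rng.normal(size=50_000)
keep = rng.uniform(size=z.size) < accept_prob(z)
samples = z[keep]
```

Learning the acceptance function lets the effective prior become much richer than the simple proposal it filters.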

Implicit Reparameterization Gradients

1 code implementation NeurIPS 2018 Michael Figurnov, Shakir Mohamed, Andriy Mnih

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models.
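
The explicit version of the trick, which this paper extends to distributions such as Gamma and Dirichlet that lack a simple standardizing transform, is a one-liner for a Gaussian:

```python
import numpy as np

rng = np.random.default_rng(0)

# gradient of E_{z ~ N(mu, sigma^2)}[z^2] w.r.t. mu via reparameterization:
# write z = mu + sigma * eps with eps ~ N(0, 1), then differentiate the
# integrand directly: d/dmu (mu + sigma * eps)^2 = 2 * z
mu, sigma = 1.5, 0.8
eps = rng.normal(size=100_000)
z = mu + sigma * eps
grad_est = np.mean(2 * z)
# analytic answer: d/dmu (mu^2 + sigma^2) = 2 * mu
```

Compared with the score-function estimator for the same quantity, this pathwise estimate has dramatically lower variance.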

Disentangling by Factorising

17 code implementations ICML 2018 Hyunjik Kim, Andriy Mnih

We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation.


Variational Memory Addressing in Generative Models

1 code implementation NeurIPS 2017 Jörg Bornschein, Andriy Mnih, Daniel Zoran, Danilo J. Rezende

Aiming to augment generative models with external memory, we interpret the output of a memory module with stochastic addressing as a conditional mixture distribution, where a read operation corresponds to sampling a discrete memory address and retrieving the corresponding content from memory.

Few-Shot Learning · Representation Learning +1
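
The stochastic read the abstract describes reduces to: score the memory slots, sample a discrete address, retrieve the content. A sketch with made-up sizes and a dot-product scoring rule:

```python
import numpy as np

rng = np.random.default_rng(0)

memory = rng.normal(size=(8, 4))    # 8 slots holding 4-d content
query = rng.normal(size=4)

# score every slot against the query and form a categorical
# distribution over addresses (a conditional mixture over slots)
scores = memory @ query
probs = np.exp(scores - scores.max())
probs /= probs.sum()

# a read samples a discrete address and retrieves its content
address = rng.choice(len(memory), p=probs)
content = memory[address]
```

Interpreting the read as sampling from a mixture is what lets the address be trained with variational methods rather than soft attention.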

Filtering Variational Objectives

3 code implementations NeurIPS 2017 Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.
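
FIVO generalizes multi-sample bounds to sequential models via particle filtering; on a conjugate toy model one can check numerically that the ELBO sits below the multi-sample importance-weighted bound, which sits below the true log marginal likelihood (the Gaussian pair below is made up):

```python
import numpy as np

rng = np.random.default_rng(0)

# toy model: z ~ N(0, 1), x | z ~ N(z, 1), proposal q(z) = the prior,
# so log w = log p(x, z) - log q(z) = log N(x; z, 1)
x = 1.0
true_log_px = -0.5 * np.log(2 * np.pi * 2.0) - x ** 2 / 4.0  # log N(x; 0, 2)

z = rng.normal(size=(100_000, 10))
log_w = -0.5 * np.log(2 * np.pi) - 0.5 * (x - z) ** 2

# single-sample bound (ELBO) vs 10-sample importance-weighted bound
elbo = log_w.mean()
m = log_w.max(axis=1, keepdims=True)
iwae_10 = np.mean(m[:, 0] + np.log(np.mean(np.exp(log_w - m), axis=1)))
```

Averaging more weights inside the log tightens the bound, which is the family of objectives the paper builds on.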

The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables

5 code implementations 2 Nov 2016 Chris J. Maddison, Andriy Mnih, Yee Whye Teh

The essence of the reparameterization trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with a fixed distribution.

Density Estimation · Structured Prediction
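
For a categorical variable, the refactoring described above is: logits plus Gumbel noise, pushed through a temperature-controlled softmax. A sketch:

```python
import numpy as np

rng = np.random.default_rng(0)

def concrete_sample(logits, temperature):
    """Sample from the Concrete (Gumbel-Softmax) relaxation: add Gumbel
    noise to the logits, divide by the temperature, and apply softmax.
    As temperature -> 0 the samples approach one-hot vectors."""
    gumbel = -np.log(-np.log(rng.uniform(size=logits.shape)))
    y = (logits + gumbel) / temperature
    y = y - y.max()                     # numerical stability
    e = np.exp(y)
    return e / e.sum()

logits = np.log(np.array([0.2, 0.3, 0.5]))
soft = concrete_sample(logits, temperature=1.0)    # smooth point on simplex
hard = concrete_sample(logits, temperature=1e-3)   # nearly one-hot
```

Because the sample is a differentiable function of the logits, gradients flow through it by ordinary backpropagation.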

Variational inference for Monte Carlo objectives

1 code implementation 22 Feb 2016 Andriy Mnih, Danilo J. Rezende

Recent progress in deep latent variable models has largely been driven by the development of flexible and scalable variational inference methods.

Variational Inference

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

2 code implementations 16 Nov 2015 Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

Neural Variational Inference and Learning in Belief Networks

2 code implementations 31 Jan 2014 Andriy Mnih, Karol Gregor

Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well.

Variational Inference
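
The variance problem NVIL attacks with learned baselines shows up even for a single Bernoulli latent and a constant baseline (the objective f below is arbitrary; NVIL learns an input-dependent baseline with a separate network):

```python
import numpy as np

rng = np.random.default_rng(0)

# gradient of E_{b ~ Bernoulli(sigmoid(theta))}[f(b)] w.r.t. theta,
# via the score-function estimator, with and without a baseline
theta = 0.5
p = 1.0 / (1.0 + np.exp(-theta))

def f(b):
    return (b - 0.8) ** 2

b = (rng.uniform(size=100_000) < p).astype(float)
score = b - p                   # d/dtheta log Bernoulli(b; sigmoid(theta))

plain = f(b) * score
centred = (f(b) - f(b).mean()) * score   # subtract a constant baseline

# both have the same expectation, (f(1) - f(0)) * p * (1 - p),
# but the centred estimator has much lower variance
```

Subtracting any quantity independent of b leaves the estimator unbiased, so the baseline can be trained freely to minimize variance.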

Learning word embeddings efficiently with noise-contrastive estimation

no code implementations NeurIPS 2013 Andriy Mnih, Koray Kavukcuoglu

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks.

Learning Word Embeddings · Word Similarity
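
Noise-contrastive estimation turns density estimation into classification: the model learns to tell data words from k samples drawn from a noise distribution, using log p_model minus log(k * p_noise) as the classifier logit. A sketch of the loss (the logit arrays below are placeholders for those quantities):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(logit_data, logit_noise):
    """Binary logistic loss: push data words towards the 'data' label
    and noise words towards the 'noise' label. Each logit stands for
    log p_model(w | context) - log(k * p_noise(w))."""
    data_term = np.log(sigmoid(logit_data)).mean()
    noise_term = np.log(sigmoid(-logit_noise)).mean()
    return -(data_term + noise_term)

# an untrained model (logits near 0) pays about 2 * log 2 nats;
# a model that scores data high and noise low pays much less
untrained = nce_loss(np.zeros(4), np.zeros(4))
trained = nce_loss(np.full(4, 3.0), np.full(4, -3.0))
```

Because the loss never needs the partition function, training cost is independent of the vocabulary size.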

Deep AutoRegressive Networks

no code implementations 31 Oct 2013 Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra

We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data.

Atari Games

Learning Label Trees for Probabilistic Modelling of Implicit Feedback

no code implementations NeurIPS 2012 Andriy Mnih, Yee W. Teh

User preferences for items can be inferred from either explicit feedback, such as item ratings, or implicit feedback, such as rental histories.

Collaborative Filtering

A Scalable Hierarchical Distributed Language Model

no code implementations NeurIPS 2008 Andriy Mnih, Geoffrey E. Hinton

Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models.

Language Modelling
