no code implementations • NeurIPS Workshop ICBINB 2021 • Wouter Kool, Chris J. Maddison, Andriy Mnih

Training large-scale mixture of experts models efficiently on modern hardware requires assigning datapoints in a batch to different experts, each with a limited capacity.

no code implementations • NeurIPS 2021 • Zhe Dong, Andriy Mnih, George Tucker

Training models with discrete latent variables is challenging due to the high variance of unbiased gradient estimators.

no code implementations • 26 Jan 2021 • Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

no code implementations • AABI Symposium 2021 • Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

no code implementations • NeurIPS 2020 • Zhe Dong, Andriy Mnih, George Tucker

Applying antithetic sampling over the augmenting variables yields a relatively low-variance and unbiased estimator applicable to any model with binary latent variables.

no code implementations • 8 Jun 2020 • Hyunjik Kim, George Papamakarios, Andriy Mnih

Lipschitz constants of neural networks have been explored in various contexts in deep learning, such as provable adversarial robustness, estimating Wasserstein distance, stabilising training of GANs, and formulating invertible neural networks.

no code implementations • 22 Jan 2020 • Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih

Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions.

1 code implementation • AABI Symposium 2019 • Jiaxin Shi, Michalis K. Titsias, Andriy Mnih

We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods.

2 code implementations • 25 Jun 2019 • Shakir Mohamed, Mihaela Rosca, Michael Figurnov, Andriy Mnih

This paper is a broad and accessible survey of the methods we have at our disposal for Monte Carlo gradient estimation in machine learning and across the statistical sciences: the problem of computing the gradient of an expectation of a function with respect to parameters defining the distribution that is integrated; the problem of sensitivity analysis.

no code implementations • ICLR 2019 • Catalin Ionescu, Tejas Kulkarni, Aaron van den Oord, Andriy Mnih, Vlad Mnih

Exploration in environments with sparse rewards is a key challenge for reinforcement learning.

6 code implementations • ICLR 2019 • Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Neural Processes (NPs) (Garnelo et al., 2018a;b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions.

no code implementations • 26 Oct 2018 • Matthias Bauer, Andriy Mnih

We propose Learned Accept/Reject Sampling (LARS), a method for constructing richer priors using rejection sampling with a learned acceptance function.

1 code implementation • NeurIPS 2018 • Michael Figurnov, Shakir Mohamed, Andriy Mnih

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models.

16 code implementations • ICML 2018 • Hyunjik Kim, Andriy Mnih

We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation.

1 code implementation • NeurIPS 2017 • Jörg Bornschein, Andriy Mnih, Daniel Zoran, Danilo J. Rezende

Aiming to augment generative models with external memory, we interpret the output of a memory module with stochastic addressing as a conditional mixture distribution, where a read operation corresponds to sampling a discrete memory address and retrieving the corresponding content from memory.

3 code implementations • NeurIPS 2017 • Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.

3 code implementations • NeurIPS 2017 • George Tucker, Andriy Mnih, Chris J. Maddison, Dieterich Lawson, Jascha Sohl-Dickstein

Learning in models with discrete latent variables is challenging due to high-variance gradient estimators.

no code implementations • 16 Mar 2017 • Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Arnaud Doucet, Andriy Mnih, Yee Whye Teh

The policy gradients of the expected return objective can react slowly to rare rewards.

4 code implementations • 2 Nov 2016 • Chris J. Maddison, Andriy Mnih, Yee Whye Teh

The essence of the trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with fixed distribution.
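To make the sentence above concrete, here is a minimal sketch of the general idea for a single Gaussian node. It is a generic illustration and not the estimator proposed in the listed paper; the function names, the test function f(z) = z^2, and the sample count are all assumptions chosen for the example. The node z ~ N(mu, sigma^2) is rewritten as a differentiable function of (mu, sigma) and noise eps ~ N(0, 1), whose distribution does not depend on the parameters.

```python
import random


def sample_reparameterized(mu, sigma, eps):
    """Write z ~ N(mu, sigma^2) as a deterministic function of the
    parameters and a noise variable eps drawn from a fixed N(0, 1)."""
    return mu + sigma * eps


def grad_estimate(mu, sigma, num_samples=10000, seed=0):
    """Monte Carlo estimate of the gradient of E[z^2] with respect to
    (mu, sigma), obtained by differentiating through the sample path."""
    rng = random.Random(seed)
    g_mu = 0.0
    g_sigma = 0.0
    for _ in range(num_samples):
        eps = rng.gauss(0.0, 1.0)
        z = sample_reparameterized(mu, sigma, eps)
        # f(z) = z^2, so df/dz = 2z; chain rule with dz/dmu = 1, dz/dsigma = eps.
        g_mu += 2.0 * z
        g_sigma += 2.0 * z * eps
    return g_mu / num_samples, g_sigma / num_samples
```

For this choice of f, E[z^2] = mu^2 + sigma^2, so the estimates can be checked against the exact gradient (2 mu, 2 sigma).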

1 code implementation • 22 Feb 2016 • Andriy Mnih, Danilo J. Rezende

Recent progress in deep latent variable models has largely been driven by the development of flexible and scalable variational inference methods.

2 code implementations • 16 Nov 2015 • Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

2 code implementations • 31 Jan 2014 • Andriy Mnih, Karol Gregor

Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well.

no code implementations • NeurIPS 2013 • Andriy Mnih, Koray Kavukcuoglu

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks.

no code implementations • 31 Oct 2013 • Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra

We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data.

no code implementations • NeurIPS 2012 • Andriy Mnih, Yee W. Teh

User preferences for items can be inferred from either explicit feedback, such as item ratings, or implicit feedback, such as rental histories.

no code implementations • NeurIPS 2008 • Andriy Mnih, Geoffrey E. Hinton

Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models.

1 code implementation • ICML 2007 (Proceedings of the 24th International Conference on Machine Learning) • Ruslan Salakhutdinov, Andriy Mnih, Geoffrey Hinton

Most of the existing approaches to collaborative filtering cannot handle very large data sets.
