Search Results for author: Andriy Mnih

Found 28 papers, 14 papers with code

A Scalable Hierarchical Distributed Language Model

no code implementations NeurIPS 2008 Andriy Mnih, Geoffrey E. Hinton

Neural probabilistic language models (NPLMs) have been shown to be competitive with and occasionally superior to the widely-used n-gram language models.

Language Modelling

Learning Label Trees for Probabilistic Modelling of Implicit Feedback

no code implementations NeurIPS 2012 Andriy Mnih, Yee W. Teh

User preferences for items can be inferred from either explicit feedback, such as item ratings, or implicit feedback, such as rental histories.

Collaborative Filtering

Deep AutoRegressive Networks

no code implementations 31 Oct 2013 Karol Gregor, Ivo Danihelka, Andriy Mnih, Charles Blundell, Daan Wierstra

We introduce a deep, generative autoencoder capable of learning hierarchies of distributed representations from data.

Atari Games

Learning word embeddings efficiently with noise-contrastive estimation

no code implementations NeurIPS 2013 Andriy Mnih, Koray Kavukcuoglu

Continuous-valued word embeddings learned by neural language models have recently been shown to capture semantic and syntactic information about words very well, setting performance records on several word similarity tasks.
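The efficiency in the title comes from noise-contrastive estimation, which replaces the expensive softmax normalisation with a binary classification problem: distinguish each observed (word, context) pair from samples drawn from a noise distribution. A minimal numpy sketch of the per-example NCE objective, with toy sizes and a uniform noise distribution; the names here are illustrative, not the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: vocabulary of V words, d-dimensional embeddings.
V, d, k = 1000, 50, 5                      # k = noise samples per data point
target_emb = rng.normal(0, 0.1, (V, d))    # embeddings being learned
context_emb = rng.normal(0, 0.1, (V, d))
noise_probs = np.full(V, 1.0 / V)          # unigram noise distribution (uniform here)

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def nce_loss(word, context):
    """Negative NCE objective for one (word, context) pair."""
    def score(w):
        # Model's unnormalised log-probability of word w in this context.
        return target_emb[w] @ context_emb[context]

    # Classify the observed word as "data" against k noise samples,
    # correcting each logit by log(k * q(w)).
    pos = np.log(sigmoid(score(word) - np.log(k * noise_probs[word])))
    noise_words = rng.choice(V, size=k, p=noise_probs)
    neg = sum(np.log(sigmoid(-(score(w) - np.log(k * noise_probs[w]))))
              for w in noise_words)
    return -(pos + neg)

print(nce_loss(word=3, context=7))
```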

Learning Word Embeddings Word Similarity

Neural Variational Inference and Learning in Belief Networks

2 code implementations 31 Jan 2014 Andriy Mnih, Karol Gregor

Highly expressive directed latent variable models, such as sigmoid belief networks, are difficult to train on large datasets because exact inference in them is intractable and none of the approximate inference methods that have been applied to them scale well.
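The paper's answer (NVIL) is to train an inference network with score-function (REINFORCE) gradients, using learned baselines to keep the variance manageable. A toy numpy sketch of the core estimator with a scalar baseline; this is illustrative only, and NVIL additionally uses input-dependent baselines and variance normalisation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy objective: f(z) for a vector of binary latents z ~ Bernoulli(sigmoid(phi)).
phi = np.zeros(4)   # variational logits
f = lambda z: -np.sum((z - np.array([1, 0, 1, 0])) ** 2)

def score_function_grad(phi, n_samples=1000, baseline=0.0):
    """REINFORCE-style gradient of E_q[f(z)] w.r.t. logits phi,
    with a scalar baseline for variance reduction."""
    p = 1.0 / (1.0 + np.exp(-phi))
    grads = np.zeros_like(phi)
    for _ in range(n_samples):
        z = (rng.random(phi.shape) < p).astype(float)
        # d/dphi log q(z) for Bernoulli(sigmoid(phi)) is (z - p).
        grads += (f(z) - baseline) * (z - p)
    return grads / n_samples

# A baseline near E[f] (about -2 here at p = 0.5) cuts the variance a lot.
print(score_function_grad(phi, baseline=-2.0))
```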

Variational Inference

MuProp: Unbiased Backpropagation for Stochastic Neural Networks

2 code implementations 16 Nov 2015 Shixiang Gu, Sergey Levine, Ilya Sutskever, Andriy Mnih

Deep neural networks are powerful parametric models that can be trained efficiently using the backpropagation algorithm.

Variational inference for Monte Carlo objectives

1 code implementation 22 Feb 2016 Andriy Mnih, Danilo J. Rezende

Recent progress in deep latent variable models has largely been driven by the development of flexible and scalable variational inference methods.

Variational Inference

The Concrete Distribution: A Continuous Relaxation of Discrete Random Variables

5 code implementations 2 Nov 2016 Chris J. Maddison, Andriy Mnih, Yee Whye Teh

The essence of the reparameterization trick is to refactor each stochastic node into a differentiable function of its parameters and a random variable with a fixed distribution.
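For a categorical variable, the Concrete (Gumbel-Softmax) relaxation does exactly this refactoring: the sample becomes a softmax of logits plus fixed Gumbel noise, with a temperature controlling how close samples are to one-hot. A minimal numpy sketch, illustrative rather than the paper's code:

```python
import numpy as np

rng = np.random.default_rng(0)

def concrete_sample(logits, temperature):
    """Sample from a Concrete (Gumbel-Softmax) relaxation of a categorical.

    The stochastic node becomes a deterministic, differentiable function
    of its parameters (the logits) and noise with a fixed distribution
    (standard Gumbel), so gradients can flow through samples.
    """
    gumbel = -np.log(-np.log(rng.random(logits.shape)))  # Gumbel(0, 1) noise
    y = (logits + gumbel) / temperature
    e = np.exp(y - y.max())
    return e / e.sum()                                   # softmax

logits = np.log(np.array([0.7, 0.2, 0.1]))
print(concrete_sample(logits, temperature=0.5))  # close to one-hot
print(concrete_sample(logits, temperature=5.0))  # close to uniform
```

As the temperature approaches zero, samples approach one-hot vectors from the underlying categorical; higher temperatures give smoother, more uniform samples.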

Density Estimation Structured Prediction

Filtering Variational Objectives

3 code implementations NeurIPS 2017 Chris J. Maddison, Dieterich Lawson, George Tucker, Nicolas Heess, Mohammad Norouzi, Andriy Mnih, Arnaud Doucet, Yee Whye Teh

When used as a surrogate objective for maximum likelihood estimation in latent variable models, the evidence lower bound (ELBO) produces state-of-the-art results.
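For reference, the bound being generalised is the standard ELBO; the filtering variational objective replaces the single-sample estimate of the marginal likelihood with a particle filter's unbiased estimate, which still yields a lower bound by Jensen's inequality:

```latex
% Standard ELBO for a latent variable model p_theta(x, z):
\log p_\theta(x) \;\ge\;
  \mathbb{E}_{q_\phi(z \mid x)}\!\big[\log p_\theta(x, z) - \log q_\phi(z \mid x)\big]

% FIVO: any unbiased, positive estimator \hat{p}(x_{1:T}) of the marginal
% likelihood (here from a particle filter) gives a bound the same way:
\mathcal{L}_{\mathrm{FIVO}} \;=\;
  \mathbb{E}\big[\log \hat{p}_{\mathrm{PF}}(x_{1:T})\big]
  \;\le\; \log p_\theta(x_{1:T})
```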

Variational Memory Addressing in Generative Models

1 code implementation NeurIPS 2017 Jörg Bornschein, Andriy Mnih, Daniel Zoran, Danilo J. Rezende

Aiming to augment generative models with external memory, we interpret the output of a memory module with stochastic addressing as a conditional mixture distribution, where a read operation corresponds to sampling a discrete memory address and retrieving the corresponding content from memory.
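In code, such a read is simply: score the memory slots, sample a discrete address, return that slot's content. A minimal numpy sketch of the mixture-style read the abstract describes, with toy sizes and no training; not the paper's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical memory: K slots of d-dimensional content.
K, d = 8, 16
memory = rng.normal(size=(K, d))
query = rng.normal(size=d)

def stochastic_read(memory, query):
    """Read as sampling from a conditional mixture: address probabilities
    come from query-slot similarity, a discrete address is sampled, and
    the corresponding slot content is retrieved."""
    logits = memory @ query
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    address = rng.choice(len(memory), p=probs)  # sample a discrete address
    return address, memory[address]             # retrieve its content

addr, content = stochastic_read(memory, query)
print(addr, content[:4])
```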

Few-Shot Learning Representation Learning +1

Disentangling by Factorising

17 code implementations ICML 2018 Hyunjik Kim, Andriy Mnih

We define and address the problem of unsupervised learning of disentangled representations on data generated from independent factors of variation.

Disentanglement

Implicit Reparameterization Gradients

1 code implementation NeurIPS 2018 Michael Figurnov, Shakir Mohamed, Andriy Mnih

By providing a simple and efficient way of computing low-variance gradients of continuous random variables, the reparameterization trick has become the technique of choice for training a variety of latent variable models.
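The explicit form of the trick, which this paper extends to distributions (Gamma, Dirichlet, von Mises, and others) whose samplers are not an invertible transform of fixed noise, is easiest to see for a Gaussian: write the sample as a deterministic function of the parameters and standard noise, then differentiate through it. A small numpy sketch giving finite-sample estimates of known gradients:

```python
import numpy as np

rng = np.random.default_rng(0)

# Explicit reparameterization for z ~ Normal(mu, sigma):
# write z = mu + sigma * eps with eps ~ Normal(0, 1), so that
# dz/dmu = 1 and dz/dsigma = eps, and gradients of E[f(z)]
# flow through the sample itself.
mu, sigma = 1.0, 0.5
f = lambda z: z ** 2            # toy objective; E[f] = mu^2 + sigma^2
df = lambda z: 2 * z

eps = rng.normal(size=100_000)
z = mu + sigma * eps
grad_mu = np.mean(df(z) * 1.0)   # pathwise estimate of d E[f] / d mu
grad_sigma = np.mean(df(z) * eps)  # pathwise estimate of d E[f] / d sigma
print(grad_mu, grad_sigma)       # ~2*mu = 2.0 and ~2*sigma = 1.0
```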

Resampled Priors for Variational Autoencoders

no code implementations 26 Oct 2018 Matthias Bauer, Andriy Mnih

We propose Learned Accept/Reject Sampling (LARS), a method for constructing richer priors using rejection sampling with a learned acceptance function.
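Mechanically, sampling from a LARS-style prior means drawing from a simple proposal and keeping each draw with a learned probability; the accepted samples then follow a density proportional to proposal times acceptance function. A toy numpy sketch with a fixed, hand-picked acceptance function standing in for the learned network; LARS itself uses a truncated variant that stops after a fixed number of rejections:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-in for the learned acceptance function a(z) in [0, 1]; in LARS
# this would be a small neural network, here a fixed bump at z = 1.5.
accept_prob = lambda z: np.exp(-0.5 * (z - 1.5) ** 2)

def lars_sample(max_tries=100):
    """Sample from the richer prior p(z) proportional to a(z) * pi(z),
    where pi is a simple proposal (standard normal) reshaped by
    accept/reject."""
    for _ in range(max_tries):
        z = rng.normal()                    # draw from the simple proposal
        if rng.random() < accept_prob(z):   # keep with learned probability
            return z
    return z  # fall back after max_tries, in the spirit of LARS truncation

samples = np.array([lars_sample() for _ in range(5000)])
print(samples.mean())   # shifted toward the acceptance bump at 1.5
```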

Attentive Neural Processes

7 code implementations ICLR 2019 Hyunjik Kim, Andriy Mnih, Jonathan Schwarz, Marta Garnelo, Ali Eslami, Dan Rosenbaum, Oriol Vinyals, Yee Whye Teh

Neural Processes (NPs; Garnelo et al., 2018a,b) approach regression by learning to map a context set of observed input-output pairs to a distribution over regression functions.
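A bare-bones sketch of that mapping, with untrained toy weights just to show the shape of the computation: encode each context pair, aggregate, and predict a Gaussian at a target input. The paper's contribution is to replace the mean aggregation below with attention over the context set; this sketch is illustrative, not the authors' model:

```python
import numpy as np

rng = np.random.default_rng(0)

d_hid = 16
W_enc = rng.normal(0, 0.3, (2, d_hid))   # pairwise (x, y) encoder
W_mu = rng.normal(0, 0.3, d_hid + 1)     # heads for the predictive
W_sigma = rng.normal(0, 0.3, d_hid + 1)  # Gaussian's mean and scale

def predict(x_context, y_context, x_target):
    pairs = np.stack([x_context, y_context], axis=1)  # (n_ctx, 2)
    r = np.tanh(pairs @ W_enc).mean(axis=0)           # aggregate the context set
    feats = np.concatenate([r, [x_target]])           # condition on target input
    mu = feats @ W_mu
    sigma = np.logaddexp(0.0, feats @ W_sigma)        # softplus keeps scale > 0
    return mu, sigma  # a distribution over the output at x_target

print(predict(np.array([0.0, 1.0]), np.array([0.0, 1.0]), x_target=0.5))
```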

regression

Monte Carlo Gradient Estimation in Machine Learning

2 code implementations 25 Jun 2019 Shakir Mohamed, Mihaela Rosca, Michael Figurnov, Andriy Mnih

This paper is a broad and accessible survey of the methods we have at our disposal for Monte Carlo gradient estimation in machine learning and across the statistical sciences: the problem of computing the gradient of an expectation of a function with respect to parameters defining the distribution that is integrated, also known as the problem of sensitivity analysis.
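Two of the estimator families the survey covers, applied to the same toy problem, make the setup concrete: the score-function estimator needs only log-density gradients, while the pathwise (reparameterization) estimator differentiates through the sampling path. A numpy sketch in which both estimate d/dmu of E[sin(z)] for z ~ N(mu, 1):

```python
import numpy as np

rng = np.random.default_rng(0)

# The core problem: d/dmu of E_{z ~ N(mu, 1)}[f(z)], estimated two ways.
mu, n = 0.5, 100_000
f = lambda z: np.sin(z)
df = lambda z: np.cos(z)
z = mu + rng.normal(size=n)   # z = mu + eps is the sampling path

# Score-function (REINFORCE) estimator: E[f(z) * d log p(z; mu) / d mu],
# using only the log-density gradient, which is (z - mu) for N(mu, 1).
score_grad = np.mean(f(z) * (z - mu))

# Pathwise (reparameterization) estimator: differentiate through the path.
pathwise_grad = np.mean(df(z))

# Both converge to cos(mu) * exp(-1/2); pathwise has lower variance here.
print(score_grad, pathwise_grad)
```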

BIG-bench Machine Learning

Sparse Orthogonal Variational Inference for Gaussian Processes

1 code implementation AABI Symposium 2019 Jiaxin Shi, Michalis K. Titsias, Andriy Mnih

We introduce a new interpretation of sparse variational approximations for Gaussian processes using inducing points, which can lead to more scalable algorithms than previous methods.

Gaussian Processes Multi-class Classification +2

Q-Learning in enormous action spaces via amortized approximate maximization

no code implementations 22 Jan 2020 Tom Van de Wiele, David Warde-Farley, Andriy Mnih, Volodymyr Mnih

Applying Q-learning to high-dimensional or continuous action spaces can be difficult due to the required maximization over the set of possible actions.

Continuous Control Q-Learning

The Lipschitz Constant of Self-Attention

no code implementations 8 Jun 2020 Hyunjik Kim, George Papamakarios, Andriy Mnih

Lipschitz constants of neural networks have been explored in various contexts in deep learning, such as provable adversarial robustness, estimating Wasserstein distance, stabilising training of GANs, and formulating invertible neural networks.

Adversarial Robustness Language Modelling

DisARM: An Antithetic Gradient Estimator for Binary Latent Variables

no code implementations NeurIPS 2020 Zhe Dong, Andriy Mnih, George Tucker

Applying antithetic sampling over the augmenting variables yields a relatively low-variance and unbiased estimator applicable to any model with binary latent variables.
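The antithetic idea can be seen already at the level of a single Bernoulli: reusing one uniform to draw a negatively correlated pair of samples preserves the marginals while letting their contributions cancel. A minimal numpy sketch of that coupling; DisARM itself applies it to the augmenting variables of the ARM estimator, so this is not the full estimator:

```python
import numpy as np

rng = np.random.default_rng(0)

# Antithetic coupling for a binary variable: one uniform u yields a
# negatively correlated pair of Bernoulli(p) samples with correct marginals.
p = 0.3
u = rng.random(100_000)
z1 = (u < p).astype(float)        # standard draw
z2 = (u > 1 - p).astype(float)    # antithetic draw from the same u

print(z1.mean(), z2.mean())       # both ~ p: marginals preserved
print(np.corrcoef(z1, z2)[0, 1])  # negative correlation (the two never
                                  # co-fire here, since p < 0.5)
```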

Generalized Doubly-Reparameterized Gradient Estimators

no code implementations AABI Symposium 2021 Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

Generalized Doubly Reparameterized Gradient Estimators

no code implementations 26 Jan 2021 Matthias Bauer, Andriy Mnih

Efficient low-variance gradient estimation enabled by the reparameterization trick (RT) has been essential to the success of variational autoencoders.

Coupled Gradient Estimators for Discrete Latent Variables

no code implementations NeurIPS 2021 Zhe Dong, Andriy Mnih, George Tucker

Training models with discrete latent variables is challenging due to the high variance of unbiased gradient estimators.

Unbiased Gradient Estimation with Balanced Assignments for Mixtures of Experts

no code implementations NeurIPS Workshop ICBINB 2021 Wouter Kool, Chris J. Maddison, Andriy Mnih

Training large-scale mixture of experts models efficiently on modern hardware requires assigning datapoints in a batch to different experts, each with a limited capacity.
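As a concrete picture of the constraint, here is a toy greedy router that fills experts in order of routing score while respecting per-expert capacity; this sketches the balanced-assignment setting only, while the paper's contribution is an unbiased gradient estimator for models trained under such assignments:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy setting: route a batch of datapoints to experts with equal capacity.
# A naive top-1 router can overflow popular experts; this simple balanced
# alternative fills experts greedily in order of routing score.
n_points, n_experts = 8, 4
capacity = n_points // n_experts                  # 2 datapoints per expert
scores = rng.normal(size=(n_points, n_experts))   # router logits

assignment = np.full(n_points, -1)
load = np.zeros(n_experts, dtype=int)
# Visit (point, expert) pairs from highest score down, respecting capacity.
for idx in np.argsort(-scores, axis=None):
    point, expert = divmod(int(idx), n_experts)
    if assignment[point] == -1 and load[expert] < capacity:
        assignment[point] = expert
        load[expert] += 1

print(assignment, load)   # every expert ends exactly at its capacity
```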

Compositional Score Modeling for Simulation-based Inference

1 code implementation 28 Sep 2022 Tomas Geffner, George Papamakarios, Andriy Mnih

Neural Posterior Estimation methods for simulation-based inference can be ill-suited for dealing with posterior distributions obtained by conditioning on multiple observations, as they tend to require a large number of simulator calls to learn accurate approximations.

Variational Inference
