1 code implementation • 14 Aug 2023 • Alex Graves, Rupesh Kumar Srivastava, Timothy Atkinson, Faustino Gomez
Notably, the network inputs for discrete data lie on the probability simplex, and are therefore natively differentiable, paving the way for gradient-based sample guidance and few-step generation in discrete domains such as language modelling.
Ranked #3 on Image Generation on Binarized MNIST
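The simplex property highlighted in the abstract can be illustrated with a toy Bayesian update of a categorical distribution: multiplying a prior on the simplex by per-class likelihoods and renormalising yields another point on the simplex. This is a minimal sketch of that general idea, not the paper's actual flow mechanism:

```python
import math

def bayesian_update(theta, log_likelihood):
    """Multiply a categorical prior by per-class likelihoods and renormalise.

    The result is again a point on the probability simplex, which is the
    property that makes discrete inputs natively differentiable here.
    """
    unnorm = [t * math.exp(ll) for t, ll in zip(theta, log_likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

prior = [0.25, 0.25, 0.25, 0.25]  # uniform over a 4-symbol alphabet
posterior = bayesian_update(prior, [2.0, 0.0, 0.0, 0.0])  # evidence favours symbol 0
```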
no code implementations • ICLR 2021 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
For highly sparse networks, SnAp with $n=2$ remains tractable and can outperform backpropagation through time in terms of learning speed when updates are done online.
no code implementations • 12 Jun 2020 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states, and prohibits updating the weights 'online' (after every timestep).
no code implementations • 6 Apr 2018 • Alex Graves, Jacob Menick, Aaron van den Oord
We conclude that ACNs are a promising new direction for representation learning: one that steps away from IID modelling, and towards learning a structured description of the dataset as a whole.
no code implementations • ICLR 2018 • Yan Wu, Greg Wayne, Alex Graves, Timothy Lillicrap
We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them.
2 code implementations • ICML 2018 • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The recently developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding than any previous system across many different languages.
15 code implementations • ICLR 2018 • Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent's policy can be used to aid efficient exploration.
Ranked #1 on Atari Games on Atari 2600 Surround
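The parametric-noise idea can be sketched as a linear layer in which every weight is mu + sigma * eps, with fresh Gaussian eps drawn per forward pass so the policy itself is stochastic. This is a pure-Python toy, not the paper's factorised-noise implementation:

```python
import random

def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """One forward pass of a NoisyNet-style linear layer.

    Each weight and bias is mu + sigma * eps with eps ~ N(0, 1) resampled
    on every pass; the sigmas are learned alongside the mus, so the agent
    can tune how much exploration noise it injects.
    """
    out = []
    for i in range(len(mu_b)):
        acc = mu_b[i] + sigma_b[i] * rng.gauss(0.0, 1.0)
        for j in range(len(x)):
            w = mu_w[i][j] + sigma_w[i][j] * rng.gauss(0.0, 1.0)
            acc += w * x[j]
        out.append(acc)
    return out
```

With all sigmas set to zero the layer reduces to an ordinary deterministic linear map.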
no code implementations • ICML 2017 • Alex Graves, Marc G. Bellemare, Jacob Menick, Remi Munos, Koray Kavukcuoglu
We introduce a method for automatically selecting the path, or syllabus, that a neural network follows through a curriculum so as to maximise learning efficiency.
11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.
Ranked #1 on Machine Translation on WMT2015 English-German
no code implementations • NeurIPS 2016 • Jack W. Rae, Jonathan J. Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap
SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories.
Ranked #6 on Question Answering on bAbi (Mean Error Rate metric)
1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.
Ranked #1 on Video Prediction on KTH (Cond metric)
60 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
Ranked #1 on Speech Synthesis on Mandarin Chinese
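WaveNet covers long audio contexts cheaply by stacking dilated causal convolutions whose dilations double layer by layer. A small helper (an illustration only; the kernel size of 2 and the 1..512 dilation schedule are assumptions matching the paper's typical configuration) shows how the receptive field grows:

```python
def receptive_field(dilations, kernel_size=2):
    """Receptive field of a stack of dilated causal convolutions.

    Each layer with dilation d extends the receptive field by
    (kernel_size - 1) * d samples; doubling dilations give exponential
    context growth for a linear number of layers.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# one WaveNet-style block: dilations double from 1 to 512
block = [2 ** i for i in range(10)]
```

A single such block already sees 1024 past samples; repeating the block multiplies the context further.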
5 code implementations • ICML 2017 • Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, David Silver, Koray Kavukcuoglu
Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating an error signal, to produce weight updates.
no code implementations • 19 Jul 2016 • Alex Graves
This report describes an alternative transform, applicable to any continuous multivariate distribution with a differentiable density function from which samples can be drawn, and uses it to derive an unbiased estimator for mixture density weight derivatives.
14 code implementations • NeurIPS 2016 • Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
This work explores conditional image generation with a new image density model based on the PixelCNN architecture.
Ranked #7 on Density Estimation on CIFAR-10
no code implementations • NeurIPS 2016 • Alexander Vezhnevets, Volodymyr Mnih, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu
We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner, purely by interacting with an environment in a reinforcement learning setting.
2 code implementations • NeurIPS 2016 • Audrūnas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves
We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs).
5 code implementations • 29 Mar 2016 • Alex Graves
This paper introduces Adaptive Computation Time (ACT), an algorithm that allows recurrent neural networks to learn how many computational steps to take between receiving an input and emitting an output.
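The halting rule at the heart of ACT can be sketched in a few lines: the network keeps pondering until its accumulated halting probability would exceed 1 - eps, and the final step receives the leftover probability so the step weights sum to one. This is a simplified illustration (it assumes the given probabilities do eventually trigger a halt), not the full differentiable mechanism with the ponder-cost penalty:

```python
def act_pondering(halting_probs, eps=0.01):
    """ACT halting rule: accumulate per-step halting probabilities until
    they would exceed 1 - eps; the halting step gets the leftover
    probability (the "remainder"), so the weights sum to one."""
    total, weights = 0.0, []
    for p in halting_probs:
        if total + p >= 1.0 - eps:
            weights.append(1.0 - total)  # remainder for the halting step
            break
        weights.append(p)
        total += p
    return weights

w = act_pondering([0.3, 0.4, 0.5, 0.9])  # halts after three ponder steps
```

The output weights are what ACT uses to average the intermediate states and outputs of the pondering steps.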
3 code implementations • 9 Feb 2016 • Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves
We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters.
70 code implementations • 4 Feb 2016 • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.
Ranked #9 on Atari Games on Atari 2600 Star Gunner
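The asynchronous update pattern can be sketched with threads sharing one parameter and applying their own gradient steps without waiting for each other. This toy optimises a 1-D quadratic rather than an Atari policy, and the Hogwild-style unlocked update is an illustrative simplification:

```python
import threading

def a3c_style_updates(n_workers=4, steps=200, lr=0.05, target=3.0):
    """Toy sketch of asynchronous gradient descent: several workers share
    one parameter and each applies its own gradient steps concurrently
    (here minimising (theta - target)^2, not a policy loss)."""
    theta = [0.0]  # shared parameter, mutated in place by every worker

    def worker():
        for _ in range(steps):
            grad = 2.0 * (theta[0] - target)  # d/dtheta of (theta - target)^2
            theta[0] -= lr * grad             # unlocked, Hogwild-style update

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta[0]
```

Despite the races, the shared parameter converges to the optimum, which is the empirical observation the asynchronous framework relies on.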
1 code implementation • 6 Jul 2015 • Nal Kalchbrenner, Ivo Danihelka, Alex Graves
This paper introduces Grid Long Short-Term Memory, a network of LSTM cells arranged in a multidimensional grid that can be applied to vectors, sequences or higher dimensional data such as images.
7 code implementations • 25 Feb 2015 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.
20 code implementations • 16 Feb 2015 • Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, Daan Wierstra
This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.
Ranked #70 on Image Generation on CIFAR-10 (bits/dimension metric)
34 code implementations • 20 Oct 2014 • Alex Graves, Greg Wayne, Ivo Danihelka
We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes.
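The content-based addressing step that lets the network read its external memory by attention can be sketched as: compare a key to every memory row by cosine similarity, sharpen with a strength beta, softmax into weights, and return the weighted sum of rows. A pure-Python toy, omitting the location-based addressing and write heads:

```python
import math

def content_read(memory, key, beta=10.0):
    """Content-based attention read over a memory matrix (list of rows):
    cosine-similarity scores, sharpened by beta, softmaxed into weights,
    then a weighted sum of the rows is returned."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb + 1e-8)

    scores = [beta * cos(row, key) for row in memory]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    w = [e / z for e in exps]
    return [sum(w[i] * memory[i][j] for i in range(len(memory)))
            for j in range(len(memory[0]))]
```

Because every step is differentiable, the read weights can be trained end to end by gradient descent.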
19 code implementations • NeurIPS 2014 • Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu
Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels.
no code implementations • ICML 2014 • Alex Graves, Navdeep Jaitly
A modification to the objective function is introduced that trains the network to minimise the expectation of an arbitrary transcription loss function.
111 code implementations • 19 Dec 2013 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.
Ranked #1 on Atari Games on Atari 2600 Pong
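The target the DQN loss regresses to is the classic Q-learning target. As a much-simplified, hedged sketch, here is that update in tabular form (a lookup table standing in for the deep network, and no replay buffer or target network):

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.99, terminal=False):
    """One tabular Q-learning step towards the same target the DQN loss
    uses: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    target = r if terminal else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q
```

DQN replaces the table with a convolutional network over pixels and stabilises training with experience replay and a periodically frozen target network.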
57 code implementations • 4 Aug 2013 • Alex Graves
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.
Ranked #41 on Language Modelling on enwik8
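Generating "one data point at a time" is an autoregressive sampling loop: sample the next symbol from the model's predictive distribution given the prefix, append it, repeat. A sketch with a trivial hand-coded stand-in for the trained LSTM (`toy_model` below is an invented example, not from the paper):

```python
import random

def sample_sequence(next_dist, seed, length, rng):
    """Autoregressive generation: repeatedly sample the next symbol from
    the model's predictive distribution over the prefix, then feed the
    sample back in as input for the next step."""
    seq = list(seed)
    for _ in range(length):
        probs = next_dist(seq)                 # dict: symbol -> probability
        symbols, weights = zip(*probs.items())
        seq.append(rng.choices(symbols, weights=weights)[0])
    return "".join(seq)

def toy_model(seq):
    """Stand-in for the trained network: after 'a' predict 'b', else 'a'."""
    return {"b": 1.0} if seq[-1] == "a" else {"a": 1.0}
```

In the paper the same loop, driven by an LSTM's softmax (or mixture density) outputs, produces text and handwriting.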
5 code implementations • 22 Mar 2013 • Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton
Recurrent neural networks (RNNs) are a powerful model for sequential data.
Ranked #18 on Speech Recognition on TIMIT
7 code implementations • 14 Nov 2012 • Alex Graves
One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinking, stretching and translating.
no code implementations • NeurIPS 2011 • Alex Graves
Variational methods have been previously explored as a tractable approximation to Bayesian inference for neural networks.
no code implementations • NeurIPS 2008 • Alex Graves, Jürgen Schmidhuber
Offline handwriting recognition, the transcription of images of handwritten text, is an interesting task in that it combines computer vision with sequence learning.
no code implementations • NeurIPS 2007 • Alex Graves, Marcus Liwicki, Horst Bunke, Jürgen Schmidhuber, Santiago Fernández
On-line handwriting recognition is unusual among sequence labelling tasks in that the underlying generator of the observed data, i.e. the movement of the pen, is recorded directly.
4 code implementations • 14 May 2007 • Alex Graves, Santiago Fernández, Jürgen Schmidhuber
Recurrent neural networks (RNNs) have proved effective at one dimensional sequence learning tasks, such as speech and online handwriting recognition.
1 code implementation • ICML 2006 • Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data.
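What lets CTC train on unsegmented data is its many-to-one map B from frame-level alignments to label sequences: merge repeated symbols, then delete blanks, so many alignments share one transcription. The map itself is tiny to write down:

```python
def ctc_collapse(path, blank="-"):
    """The CTC map B: merge consecutive repeats, then drop blanks.

    Many frame-level paths collapse to the same label sequence; CTC sums
    their probabilities, so no pre-segmented alignment is needed."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return "".join(out)
```

For example, the paths "aa-ab-" and "a--ab-" both collapse to "aab"; the blank symbol is what allows genuine repeated labels to survive the merge.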