1 code implementation • 14 Aug 2023 • Alex Graves, Rupesh Kumar Srivastava, Timothy Atkinson, Faustino Gomez
Notably, the network inputs for discrete data lie on the probability simplex, and are therefore natively differentiable, paving the way for gradient-based sample guidance and few-step generation in discrete domains such as language modelling.
Ranked #3 on Image Generation on Binarized MNIST
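The simplex property highlighted in the abstract can be illustrated with a toy Bayesian update of a categorical distribution: multiplying a prior on the simplex by per-class likelihoods and renormalising yields another point on the simplex. This is a minimal sketch of that general idea, not the paper's actual flow mechanism:

```python
import math

def bayesian_update(theta, log_likelihood):
    """Multiply a categorical prior by per-class likelihoods and renormalise.

    The result is again a point on the probability simplex, which is the
    property that makes discrete inputs natively differentiable here.
    """
    unnorm = [t * math.exp(ll) for t, ll in zip(theta, log_likelihood)]
    z = sum(unnorm)
    return [u / z for u in unnorm]

prior = [0.25, 0.25, 0.25, 0.25]  # uniform over a 4-symbol alphabet
posterior = bayesian_update(prior, [2.0, 0.0, 0.0, 0.0])  # evidence favours symbol 0
```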
no code implementations • ICLR 2021 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
For highly sparse networks, SnAp with $n=2$ remains tractable and can outperform backpropagation through time in terms of learning speed when updates are done online.
no code implementations • 12 Jun 2020 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states, and prohibits updating the weights 'online' (after every timestep).
no code implementations • 6 Apr 2018 • Alex Graves, Jacob Menick, Aaron van den Oord
We conclude that ACNs are a promising new direction for representation learning: one that steps away from IID modelling, and towards learning a structured description of the dataset as a whole.
no code implementations • ICLR 2018 • Yan Wu, Greg Wayne, Alex Graves, Timothy Lillicrap
We present an end-to-end trained memory system that quickly adapts to new data and generates samples like them.
2 code implementations • ICML 2018 • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The recently developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding than any previous system across many different languages.
15 code implementations • ICLR 2018 • Meire Fortunato, Mohammad Gheshlaghi Azar, Bilal Piot, Jacob Menick, Ian Osband, Alex Graves, Vlad Mnih, Remi Munos, Demis Hassabis, Olivier Pietquin, Charles Blundell, Shane Legg
We introduce NoisyNet, a deep reinforcement learning agent with parametric noise added to its weights, and show that the induced stochasticity of the agent's policy can be used to aid efficient exploration.
Ranked #1 on Atari Games on Atari 2600 Surround
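The parametric-noise idea can be sketched as a linear layer in which every weight is mu + sigma * eps, with fresh Gaussian eps drawn per forward pass so the policy itself is stochastic. This is a pure-Python toy, not the paper's factorised-noise implementation:

```python
import random

def noisy_linear(x, mu_w, sigma_w, mu_b, sigma_b, rng):
    """One forward pass of a NoisyNet-style linear layer.

    Each weight and bias is mu + sigma * eps with eps ~ N(0, 1) resampled
    on every pass; the sigmas are learned alongside the mus, so the agent
    can tune how much exploration noise it injects.
    """
    out = []
    for i in range(len(mu_b)):
        acc = mu_b[i] + sigma_b[i] * rng.gauss(0.0, 1.0)
        for j in range(len(x)):
            w = mu_w[i][j] + sigma_w[i][j] * rng.gauss(0.0, 1.0)
            acc += w * x[j]
        out.append(acc)
    return out
```

With all sigmas set to zero the layer reduces to an ordinary deterministic linear map.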
no code implementations • ICML 2017 • Alex Graves, Marc G. Bellemare, Jacob Menick, Remi Munos, Koray Kavukcuoglu
We introduce a method for automatically selecting the path, or syllabus, that a neural network follows through a curriculum so as to maximise learning efficiency.
11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.
Ranked #1 on Machine Translation on WMT2015 English-German
no code implementations • NeurIPS 2016 • Jack W. Rae, Jonathan J. Hunt, Tim Harley, Ivo Danihelka, Andrew Senior, Greg Wayne, Alex Graves, Timothy P. Lillicrap
SAM learns with comparable data efficiency to existing models on a range of synthetic tasks and one-shot Omniglot character recognition, and can scale to tasks requiring 100,000s of time steps and memories.
Ranked #6 on Question Answering on bAbi (Mean Error Rate metric)
1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.
Ranked #1 on Video Prediction on KTH (Cond metric)
60 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
Ranked #1 on Speech Synthesis on Mandarin Chinese
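WaveNet covers long audio contexts cheaply by stacking dilated causal convolutions whose dilations double layer by layer. A small helper (an illustration only; the kernel size of 2 and the 1..512 dilation schedule are assumptions matching the paper's typical configuration) shows how the receptive field grows:

```python
def receptive_field(dilations, kernel_size=2):
    """Receptive field of a stack of dilated causal convolutions.

    Each layer with dilation d extends the receptive field by
    (kernel_size - 1) * d samples; doubling dilations give exponential
    context growth for a linear number of layers.
    """
    rf = 1
    for d in dilations:
        rf += (kernel_size - 1) * d
    return rf

# one WaveNet-style block: dilations double from 1 to 512
block = [2 ** i for i in range(10)]
```

A single such block already sees 1024 past samples; repeating the block multiplies the context further.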
5 code implementations • ICML 2017 • Max Jaderberg, Wojciech Marian Czarnecki, Simon Osindero, Oriol Vinyals, Alex Graves, David Silver, Koray Kavukcuoglu
Training directed neural networks typically requires forward-propagating data through a computation graph, followed by backpropagating an error signal, to produce weight updates.
no code implementations • 19 Jul 2016 • Alex Graves
This report describes an alternative transform, applicable to any continuous multivariate distribution with a differentiable density function from which samples can be drawn, and uses it to derive an unbiased estimator for mixture density weight derivatives.
14 code implementations • NeurIPS 2016 • Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
This work explores conditional image generation with a new image density model based on the PixelCNN architecture.
Ranked #7 on Density Estimation on CIFAR-10
no code implementations • NeurIPS 2016 • Alexander Vezhnevets, Volodymyr Mnih, John Agapiou, Simon Osindero, Alex Graves, Oriol Vinyals, Koray Kavukcuoglu
We present a novel deep recurrent neural network architecture that learns to build implicit plans in an end-to-end manner, purely by interacting with an environment in a reinforcement learning setting.
2 code implementations • NeurIPS 2016 • Audrūnas Gruslys, Remi Munos, Ivo Danihelka, Marc Lanctot, Alex Graves
We propose a novel approach to reduce memory consumption of the backpropagation through time (BPTT) algorithm when training recurrent neural networks (RNNs).
5 code implementations • 29 Mar 2016 • Alex Graves
This paper introduces Adaptive Computation Time (ACT), an algorithm that allows recurrent neural networks to learn how many computational steps to take between receiving an input and emitting an output.
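The halting rule at the heart of ACT can be sketched in a few lines: the network keeps pondering until its accumulated halting probability would exceed 1 - eps, and the final step receives the leftover probability so the step weights sum to one. This is a simplified illustration (it assumes the given probabilities do eventually trigger a halt), not the full differentiable mechanism with the ponder-cost penalty:

```python
def act_pondering(halting_probs, eps=0.01):
    """ACT halting rule: accumulate per-step halting probabilities until
    they would exceed 1 - eps; the halting step gets the leftover
    probability (the "remainder"), so the weights sum to one."""
    total, weights = 0.0, []
    for p in halting_probs:
        if total + p >= 1.0 - eps:
            weights.append(1.0 - total)  # remainder for the halting step
            break
        weights.append(p)
        total += p
    return weights

w = act_pondering([0.3, 0.4, 0.5, 0.9])  # halts after three ponder steps
```

The output weights are what ACT uses to average the intermediate states and outputs of the pondering steps.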
3 code implementations • 9 Feb 2016 • Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves
We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters.
70 code implementations • 4 Feb 2016 • Volodymyr Mnih, Adrià Puigdomènech Badia, Mehdi Mirza, Alex Graves, Timothy P. Lillicrap, Tim Harley, David Silver, Koray Kavukcuoglu
We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.
Ranked #9 on Atari Games on Atari 2600 Star Gunner
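The asynchronous update pattern can be sketched with threads sharing one parameter and applying their own gradient steps without waiting for each other. This toy optimises a 1-D quadratic rather than an Atari policy, and the Hogwild-style unlocked update is an illustrative simplification:

```python
import threading

def a3c_style_updates(n_workers=4, steps=200, lr=0.05, target=3.0):
    """Toy sketch of asynchronous gradient descent: several workers share
    one parameter and each applies its own gradient steps concurrently
    (here minimising (theta - target)^2, not a policy loss)."""
    theta = [0.0]  # shared parameter, mutated in place by every worker

    def worker():
        for _ in range(steps):
            grad = 2.0 * (theta[0] - target)  # d/dtheta of (theta - target)^2
            theta[0] -= lr * grad             # unlocked, Hogwild-style update

    threads = [threading.Thread(target=worker) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return theta[0]
```

Despite the races, the shared parameter converges to the optimum, which is the empirical observation the asynchronous framework relies on.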
1 code implementation • 6 Jul 2015 • Nal Kalchbrenner, Ivo Danihelka, Alex Graves
This paper introduces Grid Long Short-Term Memory, a network of LSTM cells arranged in a multidimensional grid that can be applied to vectors, sequences or higher dimensional data such as images.
7 code implementations • 25 Feb 2015 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.
20 code implementations • 16 Feb 2015 • Karol Gregor, Ivo Danihelka, Alex Graves, Danilo Jimenez Rezende, Daan Wierstra
This paper introduces the Deep Recurrent Attentive Writer (DRAW) neural network architecture for image generation.
Ranked #70 on Image Generation on CIFAR-10 (bits/dimension metric)
34 code implementations • 20 Oct 2014 • Alex Graves, Greg Wayne, Ivo Danihelka
We extend the capabilities of neural networks by coupling them to external memory resources, which they can interact with by attentional processes.
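The content-based addressing step that lets the network read its external memory by attention can be sketched as: compare a key to every memory row by cosine similarity, sharpen with a strength beta, softmax into weights, and return the weighted sum of rows. A pure-Python toy, omitting the location-based addressing and write heads:

```python
import math

def content_read(memory, key, beta=10.0):
    """Content-based attention read over a memory matrix (list of rows):
    cosine-similarity scores, sharpened by beta, softmaxed into weights,
    then a weighted sum of the rows is returned."""
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        na = math.sqrt(sum(x * x for x in a))
        nb = math.sqrt(sum(x * x for x in b))
        return dot / (na * nb + 1e-8)

    scores = [beta * cos(row, key) for row in memory]
    m = max(scores)                      # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    w = [e / z for e in exps]
    return [sum(w[i] * memory[i][j] for i in range(len(memory)))
            for j in range(len(memory[0]))]
```

Because every step is differentiable, the read weights can be trained end to end by gradient descent.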
19 code implementations • NeurIPS 2014 • Volodymyr Mnih, Nicolas Heess, Alex Graves, Koray Kavukcuoglu
Applying convolutional neural networks to large images is computationally expensive because the amount of computation scales linearly with the number of image pixels.
no code implementations • ICML 2014 • Alex Graves, Navdeep Jaitly
A modification to the objective function is introduced that trains the network to minimise the expectation of an arbitrary transcription loss function.
111 code implementations • 19 Dec 2013 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, Martin Riedmiller
We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.
Ranked #1 on Atari Games on Atari 2600 Pong
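The target the DQN loss regresses to is the classic Q-learning target. As a much-simplified, hedged sketch, here is that update in tabular form (a lookup table standing in for the deep network, and no replay buffer or target network):

```python
def q_update(Q, s, a, r, s_next, alpha=0.5, gamma=0.99, terminal=False):
    """One tabular Q-learning step towards the same target the DQN loss
    uses: r + gamma * max_a' Q(s', a'), or just r at episode end."""
    target = r if terminal else r + gamma * max(Q[s_next])
    Q[s][a] += alpha * (target - Q[s][a])
    return Q
```

DQN replaces the table with a convolutional network over pixels and stabilises training with experience replay and a periodically frozen target network.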
57 code implementations • 4 Aug 2013 • Alex Graves
This paper shows how Long Short-term Memory recurrent neural networks can be used to generate complex sequences with long-range structure, simply by predicting one data point at a time.
Ranked #41 on Language Modelling on enwik8
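Generating "one data point at a time" is an autoregressive sampling loop: sample the next symbol from the model's predictive distribution given the prefix, append it, repeat. A sketch with a trivial hand-coded stand-in for the trained LSTM (`toy_model` below is an invented example, not from the paper):

```python
import random

def sample_sequence(next_dist, seed, length, rng):
    """Autoregressive generation: repeatedly sample the next symbol from
    the model's predictive distribution over the prefix, then feed the
    sample back in as input for the next step."""
    seq = list(seed)
    for _ in range(length):
        probs = next_dist(seq)                 # dict: symbol -> probability
        symbols, weights = zip(*probs.items())
        seq.append(rng.choices(symbols, weights=weights)[0])
    return "".join(seq)

def toy_model(seq):
    """Stand-in for the trained network: after 'a' predict 'b', else 'a'."""
    return {"b": 1.0} if seq[-1] == "a" else {"a": 1.0}
```

In the paper the same loop, driven by an LSTM's softmax (or mixture density) outputs, produces text and handwriting.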
5 code implementations • 22 Mar 2013 • Alex Graves, Abdel-rahman Mohamed, Geoffrey Hinton
Recurrent neural networks (RNNs) are a powerful model for sequential data.
Ranked #18 on Speech Recognition on TIMIT
7 code implementations • 14 Nov 2012 • Alex Graves
One of the key challenges in sequence transduction is learning to represent both the input and output sequences in a way that is invariant to sequential distortions such as shrinking, stretching and translating.
no code implementations • NeurIPS 2011 • Alex Graves
Variational methods have been previously explored as a tractable approximation to Bayesian inference for neural networks.
no code implementations • NeurIPS 2008 • Alex Graves, Jürgen Schmidhuber
Offline handwriting recognition, the transcription of images of handwritten text, is an interesting task in that it combines computer vision with sequence learning.
no code implementations • NeurIPS 2007 • Alex Graves, Marcus Liwicki, Horst Bunke, Jürgen Schmidhuber, Santiago Fernández
On-line handwriting recognition is unusual among sequence labelling tasks in that the underlying generator of the observed data, i.e. the movement of the pen, is recorded directly.
4 code implementations • 14 May 2007 • Alex Graves, Santiago Fernández, Jürgen Schmidhuber
Recurrent neural networks (RNNs) have proved effective at one dimensional sequence learning tasks, such as speech and online handwriting recognition.
1 code implementation • ICML 2006 • Alex Graves, Santiago Fernández, Faustino Gomez, Jürgen Schmidhuber
Many real-world sequence learning tasks require the prediction of sequences of labels from noisy, unsegmented input data.
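What lets CTC train on unsegmented data is its many-to-one map B from frame-level alignments to label sequences: merge repeated symbols, then delete blanks, so many alignments share one transcription. The map itself is tiny to write down:

```python
def ctc_collapse(path, blank="-"):
    """The CTC map B: merge consecutive repeats, then drop blanks.

    Many frame-level paths collapse to the same label sequence; CTC sums
    their probabilities, so no pre-segmented alignment is needed."""
    out = []
    prev = None
    for s in path:
        if s != prev and s != blank:
            out.append(s)
        prev = s
    return "".join(out)
```

For example, the paths "aa-ab-" and "a--ab-" both collapse to "aab"; the blank symbol is what allows genuine repeated labels to survive the merge.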