1 code implementation • ICLR 2022 • Nikolay Savinov, Junyoung Chung, Mikolaj Binkowski, Erich Elsen, Aaron van den Oord
In this paper we propose a new generative model of text, the Step-unrolled Denoising Autoencoder (SUNDAE), which does not rely on autoregressive models.
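As a rough illustration of the step-unrolled denoising idea (not the paper's exact recipe; `model`, `vocab_size`, and `unroll_steps` are illustrative placeholders): corrupt the target tokens, denoise them, then denoise the model's own samples, averaging the cross-entropy over unroll steps.

```python
import torch
import torch.nn.functional as F

def sundae_loss(model, targets, vocab_size, unroll_steps=2):
    """Unrolled denoising sketch: corrupt, denoise, then denoise the
    model's own samples; average cross-entropy against clean targets."""
    batch, seq_len = targets.shape
    # Corrupt a random fraction of tokens with uniform-random replacements.
    corrupt_prob = torch.rand(batch, 1, device=targets.device)
    mask = torch.rand(batch, seq_len, device=targets.device) < corrupt_prob
    noise = torch.randint(vocab_size, targets.shape, device=targets.device)
    x = torch.where(mask, noise, targets)

    total = 0.0
    for _ in range(unroll_steps):
        logits = model(x)                                    # (batch, seq, vocab)
        total = total + F.cross_entropy(logits.reshape(-1, vocab_size),
                                        targets.reshape(-1))
        # Unroll: feed samples from the current predictions back in
        # (no gradient flows through the sampling).
        probs = F.softmax(logits.detach(), dim=-1).reshape(-1, vocab_size)
        x = torch.multinomial(probs, 1).reshape(batch, seq_len)
    return total / unroll_steps
```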
no code implementations • ICCV 2021 • Yonglong Tian, Olivier J. Hénaff, Aaron van den Oord
Self-supervised learning holds promise in leveraging large amounts of unlabeled data; however, much of its progress has thus far been limited to highly curated pre-training datasets such as ImageNet.
2 code implementations • ICCV 2021 • Olivier J. Hénaff, Skanda Koppula, Jean-Baptiste Alayrac, Aaron van den Oord, Oriol Vinyals, João Carreira
Self-supervised pretraining has been shown to yield powerful representations for transfer learning.
Ranked #60 on Semantic Segmentation on Cityscapes val (using extra training data)
no code implementations • 11 Mar 2021 • Luyu Wang, Aaron van den Oord
Recent advances suggest that multi-modal training offers advantages over single-modal methods.
Ranked #18 on Audio Classification on ESC-50 (using extra training data)
no code implementations • ICLR 2021 • Sven Gowal, Po-Sen Huang, Aaron van den Oord, Timothy Mann, Pushmeet Kohli
Experiments on CIFAR-10 against $\ell_2$ and $\ell_\infty$ norm-bounded perturbations demonstrate that BYORL achieves near state-of-the-art robustness with as little as 500 labeled examples.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord
Unsupervised speech representation learning has shown remarkable success at finding representations that correlate with phonetic structures and improve downstream speech recognition performance.
no code implementations • 25 Sep 2019 • Kazuya Kawakami, Luyu Wang, Chris Dyer, Phil Blunsom, Aaron van den Oord
We present an unsupervised method for learning speech representations based on bidirectional contrastive predictive coding, which implicitly discovers phonetic structure from large-scale corpora of unlabelled raw audio signals.
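A minimal sketch of what "bidirectional" could mean here: a forward and a backward context network over encoder latents, each feeding the same contrastive objective (the scoring itself is sketched under the 2018 contrastive predictive coding entry further down). Names and shapes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class BiCPCContext(nn.Module):
    """Hypothetical bidirectional context network: a forward RNN summarizes
    the past to predict future latents, a backward RNN does the reverse."""
    def __init__(self, dim):
        super().__init__()
        self.fwd = nn.GRU(dim, dim, batch_first=True)
        self.bwd = nn.GRU(dim, dim, batch_first=True)

    def forward(self, z):                  # z: (batch, time, dim) encoder latents
        c_fwd, _ = self.fwd(z)             # context built left-to-right
        c_bwd, _ = self.bwd(z.flip(1))     # context built right-to-left
        return c_fwd, c_bwd.flip(1)        # re-align the backward stream

z = torch.randn(4, 100, 64)                # (batch, time, dim) dummy latents
c_fwd, c_bwd = BiCPCContext(64)(z)
```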
no code implementations • NeurIPS 2019 • Karol Gregor, Danilo Jimenez Rezende, Frederic Besse, Yan Wu, Hamza Merzic, Aaron van den Oord
We propose a way to efficiently train expressive generative models in complex environments.
15 code implementations • NeurIPS 2019 • Ali Razavi, Aaron van den Oord, Oriol Vinyals
We explore the use of Vector Quantized Variational AutoEncoder (VQ-VAE) models for large-scale image generation.
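For readers new to VQ-VAEs, a minimal sketch of the quantization bottleneck (the codebook and commitment losses used in these papers are omitted); `codebook` stands in for the learnable embedding table.

```python
import torch

def vector_quantize(z, codebook):
    """Nearest-neighbour lookup at the VQ-VAE bottleneck, with a
    straight-through gradient so the encoder still receives a signal."""
    # z: (batch, dim), codebook: (num_codes, dim)
    dists = torch.cdist(z, codebook)        # (batch, num_codes) distances
    codes = dists.argmin(dim=1)             # discrete latent indices
    z_q = codebook[codes]                   # quantized vectors
    # Straight-through estimator: forward pass uses z_q, backward pass
    # copies gradients from z_q to z unchanged.
    z_q = z + (z_q - z).detach()
    return z_q, codes

codebook = torch.nn.Parameter(torch.randn(512, 64))   # learnable in practice
z_q, codes = vector_quantize(torch.randn(8, 64), codebook)
```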
4 code implementations • ICML 2020 • Olivier J. Hénaff, Aravind Srinivas, Jeffrey De Fauw, Ali Razavi, Carl Doersch, S. M. Ali Eslami, Aaron van den Oord
Human observers can learn to recognize new categories of images from a handful of examples, yet doing so with artificial ones remains an open challenge.
Ranked #6 on Contrastive Learning on imagenet-1k
3 code implementations • 16 May 2019 • Ben Poole, Sherjil Ozair, Aaron van den Oord, Alexander A. Alemi, George Tucker
Estimating and optimizing Mutual Information (MI) is core to many problems in machine learning; however, bounding MI in high dimensions is challenging.
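For concreteness, the InfoNCE bound central to this line of work, written from the standard formulation (treat it as a sketch): it lower-bounds mutual information yet saturates at $\log K$, which is one face of the high-dimensional difficulty the abstract mentions.

```latex
% InfoNCE: a multi-sample lower bound on I(X;Y) that cannot exceed log K,
% illustrating why bounding MI tightly in high dimensions is hard.
\[
I(X;Y)\;\ge\;
\mathbb{E}\!\left[\frac{1}{K}\sum_{i=1}^{K}
  \log\frac{e^{f(x_i,y_i)}}{\tfrac{1}{K}\sum_{j=1}^{K} e^{f(x_i,y_j)}}\right]
\;\le\;\log K,
\]
% where f is a learned critic and the expectation is over K i.i.d. draws
% (x_i, y_i) from the joint p(x, y).
```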
no code implementations • NeurIPS 2019 • Sherjil Ozair, Corey Lynch, Yoshua Bengio, Aaron van den Oord, Sergey Levine, Pierre Sermanet
Mutual information maximization has emerged as a powerful learning objective for unsupervised representation learning, obtaining state-of-the-art performance in applications such as object recognition, speech recognition, and reinforcement learning.
26 code implementations • 10 Jul 2018 • Aaron van den Oord, Yazhe Li, Oriol Vinyals
The key insight of our model is to learn such representations by predicting the future in latent space using powerful autoregressive models (sketched below).
Ranked #31 on Semi-Supervised Image Classification on ImageNet - 1% labeled data (Top 5 Accuracy metric)
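A minimal sketch of that idea: context features are projected forward and scored against candidate future latents, with the other batch entries serving as negatives (InfoNCE). The linear predictor stands in for the paper's per-step transforms; all shapes are illustrative.

```python
import torch
import torch.nn.functional as F

def info_nce(context, future, predictor):
    """Contrastive predictive coding step: score each context against every
    candidate future latent in the batch; the true pairing is the positive,
    all other batch entries act as negatives."""
    pred = predictor(context)                  # (batch, dim) predicted future latent
    logits = pred @ future.t()                 # (batch, batch) similarity scores
    labels = torch.arange(future.size(0), device=future.device)
    return F.cross_entropy(logits, labels)     # InfoNCE loss

# Illustrative usage with random stand-ins for encoder outputs:
batch, dim = 8, 64
predictor = torch.nn.Linear(dim, dim, bias=False)   # plays the role of W_k
c_t = torch.randn(batch, dim)          # autoregressive context at time t
z_tk = torch.randn(batch, dim)         # encoder latent at time t+k
loss = info_nce(c_t, z_tk, predictor)
```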
no code implementations • 6 Apr 2018 • Alex Graves, Jacob Menick, Aaron van den Oord
We conclude that ACNs are a promising new direction for representation learning: one that steps away from IID modelling, and towards learning a structured description of the dataset as a whole.
16 code implementations • ICML 2018 • Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu
The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.
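A simplified, unstructured version of the pruning that yields such sparse networks (the paper uses block sparsity with a gradual schedule during training); `sparsity=0.95` is illustrative.

```python
import torch

def magnitude_prune(weight, sparsity):
    """Zero out the smallest-magnitude entries of a weight matrix; keeping
    few weights is what makes real-time sampling on a mobile CPU feasible."""
    k = int(weight.numel() * sparsity)        # number of weights to drop
    if k == 0:
        return weight
    threshold = weight.abs().flatten().kthvalue(k).values
    return weight * (weight.abs() > threshold)

w = torch.randn(512, 512)
w_sparse = magnitude_prune(w, sparsity=0.95)   # keep roughly 5% of weights
print((w_sparse != 0).float().mean())          # ≈ 0.05
```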
no code implementations • ICML 2018 • Jonathan Uesato, Brendan O'Donoghue, Aaron van den Oord, Pushmeet Kohli
We motivate 'adversarial risk' as an objective for achieving models robust to worst-case inputs.
2 code implementations • ICML 2018 • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The recently developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system.
47 code implementations • NeurIPS 2017 • Aaron van den Oord, Oriol Vinyals, Koray Kavukcuoglu
Learning useful representations without supervision remains a key challenge in machine learning.
1 code implementation • ICML 2017 • Georg Ostrovski, Marc G. Bellemare, Aaron van den Oord, Remi Munos
This pseudo-count was used to generate an exploration bonus for a DQN agent and, combined with a mixed Monte Carlo update, was sufficient to achieve state-of-the-art performance on the Atari 2600 game Montezuma's Revenge (see the sketch below).
Ranked #9 on Atari Games on Atari 2600 Montezuma's Revenge
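A small sketch of the pseudo-count machinery, assuming the standard formulation: the density model's probability of a state before (`rho`) and after (`rho_prime`) one more observation of it yields a pseudo-count, which in turn sets the bonus. The constants `beta` and `0.01` are illustrative.

```python
import math

def pseudo_count(rho, rho_prime):
    """Pseudo-count derived from a density model's probability of a state
    before (rho) and after (rho_prime) observing it once more."""
    gain = rho_prime - rho            # prediction gain; positive for a learning model
    return rho * (1.0 - rho_prime) / max(gain, 1e-8)

def exploration_bonus(rho, rho_prime, beta=0.05):
    """Count-based bonus added to the environment reward."""
    n_hat = pseudo_count(rho, rho_prime)
    return beta / math.sqrt(n_hat + 0.01)

# A state the model finds surprising gets a large bonus:
print(exploration_bonus(rho=1e-6, rho_prime=2e-6))       # ≈ beta
# A familiar state gets almost none:
print(exploration_bonus(rho=0.02, rho_prime=0.0200001))  # ≈ 0
```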
11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
The ByteNet is a one-dimensional convolutional neural network composed of two parts, one to encode the source sequence and the other to decode the target sequence (see the sketch below).
Ranked #1 on Machine Translation on WMT2015 English-German
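A minimal sketch of the two ingredients (the residual blocks and dynamic unfolding of the actual architecture are omitted): the encoder uses ordinary dilated 1-D convolutions over the whole source, while the decoder's convolutions are left-padded so they stay causal over the target. All sizes are illustrative.

```python
import torch
import torch.nn as nn

def causal_conv(channels, dilation):
    """Dilated 1-D convolution made causal by left-padding only, so the
    decoder cannot look at future target tokens."""
    return nn.Sequential(
        nn.ConstantPad1d(((3 - 1) * dilation, 0), 0.0),
        nn.Conv1d(channels, channels, kernel_size=3, dilation=dilation),
    )

channels = 64
encoder = nn.Sequential(*[
    nn.Conv1d(channels, channels, 3, padding=d, dilation=d)  # sees both directions
    for d in (1, 2, 4, 8)
])
decoder = nn.Sequential(*[causal_conv(channels, d) for d in (1, 2, 4, 8)])

src = torch.randn(1, channels, 100)     # embedded source sequence
out = decoder(encoder(src))             # decoder stacked on the encoder output
print(out.shape)                        # torch.Size([1, 64, 100])
```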
1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
The Video Pixel Network (VPN) approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and its generated videos show only minor deviations from the ground truth.
Ranked #1 on Video Prediction on KTH (Cond metric)
61 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms (one of its residual blocks is sketched below).
Ranked #1 on Speech Synthesis on Mandarin Chinese
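A minimal sketch of one WaveNet-style residual block, assuming the usual formulation: a dilated causal convolution feeding the gated activation tanh(Wf * x) * sigmoid(Wg * x). Skip connections and the output head are omitted; channel counts are illustrative.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class WaveNetBlock(nn.Module):
    """Dilated causal convolution with a gated activation, plus a residual
    connection; stacking blocks with growing dilation widens the receptive
    field exponentially."""
    def __init__(self, channels, dilation):
        super().__init__()
        self.pad = (2 - 1) * dilation         # left-pad only: causal
        self.filter = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.gate = nn.Conv1d(channels, channels, 2, dilation=dilation)
        self.res = nn.Conv1d(channels, channels, 1)

    def forward(self, x):
        h = F.pad(x, (self.pad, 0))
        out = torch.tanh(self.filter(h)) * torch.sigmoid(self.gate(h))
        return x + self.res(out)              # residual connection

blocks = nn.Sequential(*[WaveNetBlock(32, 2 ** i) for i in range(8)])
audio = torch.randn(1, 32, 16000)    # embedded audio samples (illustrative)
print(blocks(audio).shape)           # length preserved: (1, 32, 16000)
```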
14 code implementations • NeurIPS 2016 • Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
This work explores conditional image generation with a new image density model based on the PixelCNN architecture (its masked convolution is sketched below).
Ranked #7 on Density Estimation on CIFAR-10
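The core of the PixelCNN factorization is the masked convolution, sketched below; the paper's gated, conditional variant adds vertical/horizontal stacks and a conditioning input, which are omitted here. `include_center` distinguishes the usual "mask A" (first layer) from "mask B".

```python
import torch
import torch.nn as nn

class MaskedConv2d(nn.Conv2d):
    """Convolution whose kernel is masked so each pixel only sees pixels
    above it and to its left: the autoregressive ordering of PixelCNN."""
    def __init__(self, *args, include_center=False, **kwargs):
        super().__init__(*args, **kwargs)
        k = self.kernel_size[0]
        mask = torch.zeros_like(self.weight)
        mask[:, :, :k // 2, :] = 1                          # rows strictly above
        mask[:, :, k // 2, :k // 2 + include_center] = 1    # left of (or at) center
        self.register_buffer("mask", mask)

    def forward(self, x):
        # Re-apply the mask before every forward pass (simple, standard trick).
        self.weight.data *= self.mask
        return super().forward(x)

conv = MaskedConv2d(1, 16, kernel_size=5, padding=2)   # 'mask A' variant
img = torch.randn(1, 1, 28, 28)
print(conv(img).shape)                                 # (1, 16, 28, 28)
```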
18 code implementations • 25 Jan 2016 • Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu
Modeling the distribution of natural images is a landmark problem in unsupervised learning.
Ranked #5 on Image Generation on Binarized MNIST
no code implementations • NeurIPS 2014 • Aaron van den Oord, Benjamin Schrauwen
In this paper we propose a new scalable deep generative model for images, called the Deep Gaussian Mixture Model, which is a straightforward but powerful generalization of GMMs to multiple layers (sketched below).
Ranked #72 on Image Generation on CIFAR-10 (bits/dimension metric)
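A hedged sketch of that generalization as we read it: each layer holds several linear maps, one of which is chosen at random per layer, so the composition defines a Gaussian mixture with multiplicatively many, heavily parameter-shared components.

```latex
% Generative process of a deep GMM (sketch): sample a path of per-layer
% transformations, then push a standard normal through their composition.
\[
z \sim \mathcal{N}(0, I), \qquad
j_l \sim \mathrm{Cat}\big(\pi^{(l)}\big)
  \ \text{independently for } l = 1,\dots,L,
\]
\[
x \;=\; b^{(L)}_{j_L} + A^{(L)}_{j_L}\Big(\cdots\; b^{(1)}_{j_1}
        + A^{(1)}_{j_1} z\Big),
\]
% so p(x) is a Gaussian mixture with prod_l k_l components whose means and
% covariances share the per-layer parameters A and b.
```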
no code implementations • NeurIPS 2013 • Aaron van den Oord, Sander Dieleman, Benjamin Schrauwen
We also show that recent advances in deep learning translate very well to the music recommendation setting, with deep convolutional neural networks significantly outperforming the traditional approach.