no code implementations • 6 Jun 2023 • Marcin Andrychowicz, Lasse Espeholt, Di Li, Samier Merchant, Alexander Merose, Fred Zyda, Shreya Agrawal, Nal Kalchbrenner
The ability of neural models to make a prediction in less than a second once the data is available, to do so at very high temporal and spatial resolution, and to learn directly from atmospheric observations are just some of these models' unique advantages.
no code implementations • 9 Mar 2022 • Manoj Kumar, Neil Houlsby, Nal Kalchbrenner, Ekin D. Cubuk
Perceptual distances between images, as measured in the space of pre-trained deep features, have outperformed prior low-level, pixel-based metrics at assessing perceptual similarity.
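A minimal sketch of how such a deep perceptual distance is typically computed, assuming a hypothetical `extract_features(image)` helper (not shown) that runs a pretrained network and returns a list of per-layer activation maps:

```python
import numpy as np

def perceptual_distance(feats_a, feats_b):
    """LPIPS-style distance: mean squared difference between
    unit-normalised deep features, accumulated over layers.
    feats_a / feats_b are lists of (channels, height, width) arrays
    produced by a pretrained network for the two images."""
    total = 0.0
    for fa, fb in zip(feats_a, feats_b):
        # Normalise each spatial position's feature vector to unit length.
        fa = fa / (np.linalg.norm(fa, axis=0, keepdims=True) + 1e-10)
        fb = fb / (np.linalg.norm(fb, axis=0, keepdims=True) + 1e-10)
        total += np.mean((fa - fb) ** 2)
    return total / len(feats_a)
```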
2 code implementations • 14 Nov 2021 • Lasse Espeholt, Shreya Agrawal, Casper Sønderby, Manoj Kumar, Jonathan Heek, Carla Bromberg, Cenk Gazen, Jason Hickey, Aaron Bell, Nal Kalchbrenner
An emerging class of weather models based on neural networks represents a paradigm shift in weather forecasting: the models learn the required transformations from data instead of relying on hand-coded physics and are computationally efficient.
no code implementations • 29 Sep 2021 • Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi
It is shown that, given (a) access to samples from intermediate distributions and (b) samples annotated with the amount of change from the source distribution, self-training can be successfully applied to gradually shifted samples to adapt the model toward the target distribution.
1 code implementation • 10 Jun 2021 • Samira Abnar, Rianne van den Berg, Golnaz Ghiasi, Mostafa Dehghani, Nal Kalchbrenner, Hanie Sedghi
It has been shown that, given (a) access to samples from intermediate distributions and (b) samples annotated with the amount of change from the source distribution, self-training can be successfully applied to gradually shifted samples to adapt the model toward the target distribution.
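A minimal sketch of the gradual self-training loop both versions of this paper describe; `model.predict` and the `fit` retraining function are assumed interfaces, not the paper's actual code:

```python
def gradual_self_train(model, fit, domains):
    """Adapt a source-trained model across unlabelled datasets ordered by
    increasing shift from the source distribution. At each step the current
    model pseudo-labels the next domain and is retrained on those labels."""
    for X in domains:                         # increasingly shifted batches
        pseudo_labels = model.predict(X)      # label the slightly shifted data
        model = fit(model, X, pseudo_labels)  # retrain on the pseudo-labels
    return model
```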
no code implementations • 22 Feb 2021 • Bernhard Schölkopf, Francesco Locatello, Stefan Bauer, Nan Rosemary Ke, Nal Kalchbrenner, Anirudh Goyal, Yoshua Bengio
The two fields of machine learning and graphical causality arose and developed separately.
2 code implementations • ICLR 2021 • Manoj Kumar, Dirk Weissenborn, Nal Kalchbrenner
We present the Colorization Transformer, a novel approach for diverse, high-fidelity image colorization based on self-attention.
Ranked #2 on Colorization on ImageNet val
2 code implementations • NeurIPS 2020 • Alexey A. Gritsenko, Tim Salimans, Rianne van den Berg, Jasper Snoek, Nal Kalchbrenner
Speech synthesis is an important practical generative modeling problem that has seen great progress over the last few years, with likelihood-based autoregressive neural models now outperforming traditional concatenative systems.
2 code implementations • 6 Apr 2020 • Shervin Minaee, Nal Kalchbrenner, Erik Cambria, Narjes Nikzad, Meysam Chenaghlu, Jianfeng Gao
Deep learning models have surpassed classical machine learning approaches in various text classification tasks, including sentiment analysis, news categorization, question answering, and natural language inference.
2 code implementations • 24 Mar 2020 • Casper Kaae Sønderby, Lasse Espeholt, Jonathan Heek, Mostafa Dehghani, Avital Oliver, Tim Salimans, Shreya Agrawal, Jason Hickey, Nal Kalchbrenner
Weather forecasting is a long-standing scientific challenge with direct social and economic impact.
2 code implementations • 20 Dec 2019 • Jonathan Ho, Nal Kalchbrenner, Dirk Weissenborn, Tim Salimans
We propose Axial Transformers, a self-attention-based autoregressive model for images and other data organized as high-dimensional tensors.
Ranked #29 on Image Generation on ImageNet 64x64 (Bits per dim metric)
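A minimal sketch of axial attention, the building block the entry above names: full self-attention over an H x W grid costs O((HW)^2), whereas attending along one axis at a time costs O(HW(H + W)). Learned projections and the autoregressive masking are omitted for brevity (q = k = v = x):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def axial_attention(x, axis):
    """Self-attention restricted to one axis of an (H, W, D) tensor: each
    position attends only to its own row (axis=1) or its own column (axis=0)."""
    x = np.swapaxes(x, axis, 1)  # move the attended axis to position 1
    scores = np.einsum('iqd,ikd->iqk', x, x) / np.sqrt(x.shape[-1])
    out = np.einsum('iqk,ikd->iqd', softmax(scores), x)
    return np.swapaxes(out, axis, 1)

grid = np.random.randn(8, 8, 16)
mixed = axial_attention(axial_attention(grid, axis=1), axis=0)  # rows, then columns
```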
no code implementations • 9 Aug 2019 • Jonathan Heek, Nal Kalchbrenner
We show that ATMC is intrinsically robust to overfitting on the training data and provides a better-calibrated measure of uncertainty than the optimization baseline.
no code implementations • ICLR 2019 • Jacob Menick, Nal Kalchbrenner
To address the latter challenge, we propose to use Multidimensional Upscaling to grow an image in both size and depth via intermediate stages utilising distinct Subscale Pixel Networks (SPNs).
Ranked #2 on Image Generation on CelebA 256x256
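A minimal sketch of the subscale ordering underlying this size-growing scheme: the image is divided into interleaved slices that are generated one after another, each conditioned on the previously generated ones (the networks themselves are omitted):

```python
import numpy as np

def subscale_slices(image, s):
    """Split an (H, W) image into s*s interleaved subimages: pixel (i, j)
    belongs to slice (i % s, j % s). Generating the slices in sequence grows
    the image in size; a second pass over bit-depth grows it in depth."""
    return [image[a::s, b::s] for a in range(s) for b in range(s)]

slices = subscale_slices(np.arange(64).reshape(8, 8), s=2)  # four 4x4 slices
```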
15 code implementations • WS 2018 • Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit
Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.
16 code implementations • ICML 2018 • Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu
The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.
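A minimal sketch of the magnitude-based pruning behind a Sparse WaveRNN; the real schedule prunes gradually during training, whereas this one-shot version only illustrates the masking step:

```python
import numpy as np

def magnitude_prune(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with smallest magnitude
    and return the pruned matrix together with its binary mask."""
    k = int(sparsity * weights.size)
    threshold = np.partition(np.abs(weights).ravel(), k)[k]
    mask = np.abs(weights) >= threshold
    return weights * mask, mask

pruned, mask = magnitude_prune(np.random.randn(512, 512), sparsity=0.95)
```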
2 code implementations • ICML 2018 • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The recently developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural-sounding than any previous system across many different languages.
no code implementations • ICML 2017 • Scott Reed, Aäron van den Oord, Nal Kalchbrenner, Sergio Gómez Colmenarejo, Ziyu Wang, Dan Belov, Nando de Freitas
Our new PixelCNN model achieves competitive density estimation and an orders-of-magnitude speedup (O(log N) sampling instead of O(N)), enabling the practical generation of 512x512 images.
Ranked #2 on Image Compression on ImageNet32
11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.
Ranked #1 on Machine Translation on WMT2015 English-German
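A minimal sketch of the dilated, causal 1-D convolution that both halves of the ByteNet stack (the decoder masked to past positions); with dilations 1, 2, 4, ... the receptive field grows exponentially with depth:

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """x: (time, in_channels); w: (kernel, in_channels, out_channels).
    Left-padding by (kernel - 1) * dilation ensures the output at time t
    depends only on inputs at times <= t."""
    k = w.shape[0]
    pad = (k - 1) * dilation
    xp = np.pad(x, ((pad, 0), (0, 0)))
    out = np.zeros((x.shape[0], w.shape[2]))
    for t in range(x.shape[0]):
        for j in range(k):  # tap j reaches j * dilation steps into the past
            out[t] += xp[t + pad - j * dilation] @ w[k - 1 - j]
    return out
```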
1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
The Video Pixel Network (VPN) approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.
Ranked #1 on Video Prediction on KTH (Cond metric)
61 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
Ranked #1 on Speech Synthesis on Mandarin Chinese
14 code implementations • NeurIPS 2016 • Aaron van den Oord, Nal Kalchbrenner, Oriol Vinyals, Lasse Espeholt, Alex Graves, Koray Kavukcuoglu
This work explores conditional image generation with a new image density model based on the PixelCNN architecture.
Ranked #7 on Density Estimation on CIFAR-10
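A minimal sketch of the conditional gated activation this paper introduces: the output of a masked convolution is split into two halves along the channel axis and modulated by a conditioning vector h (for example, a class embedding); Wf and Wg are learned projections:

```python
import numpy as np

def gated_activation(conv_out, h, Wf, Wg):
    """y = tanh(a + h @ Wf) * sigmoid(b + h @ Wg), where (a, b) are the two
    halves of the masked-convolution output along its channel axis."""
    a, b = np.split(conv_out, 2, axis=-1)
    gate = 1.0 / (1.0 + np.exp(-(b + h @ Wg)))
    return np.tanh(a + h @ Wf) * gate
```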
3 code implementations • 9 Feb 2016 • Ivo Danihelka, Greg Wayne, Benigno Uria, Nal Kalchbrenner, Alex Graves
We investigate a new method to augment recurrent neural networks with extra memory without increasing the number of network parameters.
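A minimal sketch of the holographic associative-memory idea behind this augmentation: key-value pairs are bound by circular convolution and superposed in a single fixed-size trace, so storing more items adds retrieval noise rather than parameters (the paper's recurrent cell uses a redundant complex-valued variant of this):

```python
import numpy as np

def bind(key, value):
    """Circular convolution of key and value, computed with FFTs."""
    return np.real(np.fft.ifft(np.fft.fft(key) * np.fft.fft(value)))

def retrieve(trace, key):
    """Approximate lookup by circular correlation with the key."""
    return np.real(np.fft.ifft(np.fft.fft(trace) * np.conj(np.fft.fft(key))))

d = 256
keys = [np.random.randn(d) / np.sqrt(d) for _ in range(3)]
values = [np.random.randn(d) for _ in range(3)]
trace = sum(bind(k, v) for k, v in zip(keys, values))  # superposed memory
noisy_v0 = retrieve(trace, keys[0])  # approximately recovers values[0]
```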
18 code implementations • 25 Jan 2016 • Aaron van den Oord, Nal Kalchbrenner, Koray Kavukcuoglu
Modeling the distribution of natural images is a landmark problem in unsupervised learning.
Ranked #5 on Image Generation on Binarized MNIST
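A minimal sketch of the autoregressive factorization these pixel models use: the joint distribution over pixels is decomposed with the chain rule and evaluated pixel by pixel in raster order. `predict_dist(prefix)` is an assumed model interface returning a 256-way categorical distribution over the next intensity:

```python
import numpy as np

def image_log_likelihood(pixels, predict_dist):
    """log p(x) = sum_i log p(x_i | x_<i) over a flattened pixel sequence."""
    total = 0.0
    for i, value in enumerate(pixels):
        probs = predict_dist(pixels[:i])  # distribution over the next pixel
        total += np.log(probs[value])
    return total
```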
1 code implementation • 6 Jul 2015 • Nal Kalchbrenner, Ivo Danihelka, Alex Graves
This paper introduces Grid Long Short-Term Memory, a network of LSTM cells arranged in a multidimensional grid that can be applied to vectors, sequences, or higher-dimensional data such as images.
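A minimal sketch of one block of a 2-D Grid LSTM, assuming a hypothetical `lstm_cell(x, h, c) -> (h, c)` helper: the incoming hidden states along every dimension are concatenated into a shared input, and each dimension applies its own LSTM transform with its own memory cell:

```python
import numpy as np

def grid_lstm_block(h_time, c_time, h_depth, c_depth, cell_time, cell_depth):
    """Unlike stacked LSTMs, the depth dimension also carries a memory cell,
    so gradients flow through gates along every axis of the grid."""
    x = np.concatenate([h_time, h_depth])  # shared input to both transforms
    h_time, c_time = cell_time(x, h_time, c_time)
    h_depth, c_depth = cell_depth(x, h_depth, c_depth)
    return h_time, c_time, h_depth, c_depth
```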
no code implementations • ACL 2014 • Dimitri Kartsaklis, Nal Kalchbrenner, Mehrnoosh Sadrzadeh
This paper provides a method for improving tensor-based compositional distributional models of meaning by the addition of an explicit disambiguation step prior to composition.
no code implementations • 15 Jun 2014 • Misha Denil, Alban Demiraj, Nal Kalchbrenner, Phil Blunsom, Nando de Freitas
Capturing the compositional process that maps the meaning of words to that of documents is a central challenge for researchers in Natural Language Processing and Information Retrieval.
5 code implementations • ACL 2014 • Nal Kalchbrenner, Edward Grefenstette, Phil Blunsom
The ability to accurately represent sentences is central to language understanding.
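This entry is the Dynamic Convolutional Neural Network for sentences; a minimal sketch of its characteristic k-max pooling, which keeps the k largest activations per feature channel while preserving their order in the sentence:

```python
import numpy as np

def k_max_pool(x, k):
    """x: (time, channels). Select, per channel, the k highest values in
    their original temporal order, so variable-length sentences map to a
    fixed-size output without losing relative position information."""
    top_k = np.sort(np.argsort(x, axis=0)[-k:], axis=0)  # positions of the k maxima
    return np.take_along_axis(x, top_k, axis=0)

pooled = k_max_pool(np.random.randn(17, 8), k=4)  # (4, 8) regardless of length
```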
no code implementations • WS 2013 • Nal Kalchbrenner, Phil Blunsom
The compositionality of meaning extends beyond the single sentence.