5 code implementations • DeepMind 2022 • Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millican, Malcolm Reynolds, Roman Ring, Eliza Rutherford, Serkan Cabi, Tengda Han, Zhitao Gong, Sina Samangooei, Marianne Monteiro, Jacob Menick, Sebastian Borgeaud, Andrew Brock, Aida Nematzadeh, Sahand Sharifzadeh, Mikołaj Bińkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, Karen Simonyan
Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.
Ranked #1 on Action Recognition on RareAct
2 code implementations • 29 Mar 2022 • Jordan Hoffmann, Sebastian Borgeaud, Arthur Mensch, Elena Buchatskaya, Trevor Cai, Eliza Rutherford, Diego de Las Casas, Lisa Anne Hendricks, Johannes Welbl, Aidan Clark, Tom Hennigan, Eric Noland, Katie Millican, George van den Driessche, Bogdan Damoc, Aurelia Guy, Simon Osindero, Karen Simonyan, Erich Elsen, Jack W. Rae, Oriol Vinyals, Laurent Sifre
We investigate the optimal model size and number of tokens for training a transformer language model under a given compute budget.
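The resulting rule of thumb is that model size and training tokens should grow in equal proportion. A minimal sketch of that allocation, assuming the standard C ≈ 6·N·D approximation for training FLOPs and the paper's roughly 20-tokens-per-parameter ratio (the helper name is ours):

```python
import math

def chinchilla_allocation(flops_budget: float, tokens_per_param: float = 20.0):
    """Split a training compute budget between model size N and data D.

    Assumes the standard approximation C ~ 6*N*D FLOPs and a rule of thumb
    of roughly 20 training tokens per parameter, so N and D both grow as C^0.5.
    """
    n_params = math.sqrt(flops_budget / (6.0 * tokens_per_param))
    n_tokens = tokens_per_param * n_params
    return n_params, n_tokens

# Chinchilla itself used ~5.76e23 FLOPs: this recovers ~70B params, ~1.4T tokens.
n, d = chinchilla_allocation(5.76e23)
print(f"params ~ {n:.2e}, tokens ~ {d:.2e}")
```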
2 code implementations • 22 Feb 2022 • Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle
This, however, hinders them from scaling up to the input sizes required to process raw high-resolution images or video.
no code implementations • 2 Feb 2022 • Aidan Clark, Diego de Las Casas, Aurelia Guy, Arthur Mensch, Michela Paganini, Jordan Hoffmann, Bogdan Damoc, Blake Hechtman, Trevor Cai, Sebastian Borgeaud, George van den Driessche, Eliza Rutherford, Tom Hennigan, Matthew Johnson, Katie Millican, Albin Cassirer, Chris Jones, Elena Buchatskaya, David Budden, Laurent Sifre, Simon Osindero, Oriol Vinyals, Jack Rae, Erich Elsen, Koray Kavukcuoglu, Karen Simonyan
The performance of a language model has been shown to be effectively modeled as a power-law in its parameter count.
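Since a power-law in parameter count is a straight line in log-log space, its exponent can be read off with a least-squares fit. An illustrative sketch on synthetic numbers (not the paper's data):

```python
import numpy as np

# Hypothetical (parameter count, validation loss) pairs following L(N) = c * N^(-alpha).
params = np.array([1e7, 1e8, 1e9, 1e10])
losses = np.array([4.2, 3.4, 2.8, 2.3])

# A power law is a straight line in log-log space: log L = log c - alpha * log N.
slope, intercept = np.polyfit(np.log(params), np.log(losses), 1)
print(f"fitted exponent alpha = {-slope:.3f}, prefactor c = {np.exp(intercept):.2f}")
```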
3 code implementations • NA 2021 • Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving
Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.
Ranked #1 on Language Modelling on StackExchange
2 code implementations • 8 Dec 2021 • Sebastian Borgeaud, Arthur Mensch, Jordan Hoffmann, Trevor Cai, Eliza Rutherford, Katie Millican, George van den Driessche, Jean-Baptiste Lespiau, Bogdan Damoc, Aidan Clark, Diego de Las Casas, Aurelia Guy, Jacob Menick, Roman Ring, Tom Hennigan, Saffron Huang, Loren Maggiore, Chris Jones, Albin Cassirer, Andy Brock, Michela Paganini, Geoffrey Irving, Oriol Vinyals, Simon Osindero, Karen Simonyan, Jack W. Rae, Erich Elsen, Laurent Sifre
We enhance auto-regressive language models by conditioning on document chunks retrieved from a large corpus, based on local similarity with preceding tokens.
Ranked #1 on Language Modelling on WikiText-103 (using extra training data)
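The retrieval step can be pictured as nearest-neighbour lookup over a chunked corpus: embed the current chunk, find the most similar corpus chunks, and condition generation on them. A toy brute-force sketch; the real system uses frozen BERT embeddings and an approximate-nearest-neighbour index over trillions of tokens, and the embed() stand-in below is purely illustrative:

```python
import numpy as np

def embed(chunk: str) -> np.ndarray:
    """Stand-in embedder (hashed bag of bytes); the paper uses a frozen BERT."""
    v = np.zeros(64)
    for b in chunk.encode():
        v[b % 64] += 1.0
    return v / (np.linalg.norm(v) + 1e-8)

def retrieve(query_chunk: str, corpus_chunks: list[str], k: int = 2) -> list[str]:
    """Return the k corpus chunks most similar to the query chunk."""
    q = embed(query_chunk)
    sims = np.array([embed(c) @ q for c in corpus_chunks])
    return [corpus_chunks[i] for i in np.argsort(-sims)[:k]]

corpus = ["the cat sat on the mat", "stock markets fell sharply",
          "a cat chased a mouse", "rainfall is expected tomorrow"]
print(retrieve("the cat sat", corpus))
```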
no code implementations • EMNLP 2021 • Rémi Leblond, Jean-Baptiste Alayrac, Laurent Sifre, Miruna Pislar, Jean-Baptiste Lespiau, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals
Beam search is the go-to method for decoding auto-regressive machine translation models.
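For reference, the baseline procedure in question: beam search keeps the k highest-scoring partial hypotheses at each step and extends each with candidate next tokens. A minimal sketch over a placeholder next-token distribution (the step function is an assumption, not part of the paper):

```python
import math

def beam_search(step, start_token, eos_token, beam_size=4, max_len=20):
    """step(prefix) -> dict of {token: log_prob} for the next token."""
    beams = [([start_token], 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(max_len):
        candidates = []
        for prefix, score in beams:
            if prefix[-1] == eos_token:      # finished hypotheses carry over
                candidates.append((prefix, score))
                continue
            for tok, logp in step(prefix).items():
                candidates.append((prefix + [tok], score + logp))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
        if all(p[-1] == eos_token for p, _ in beams):
            break
    return beams[0]

# Toy next-token distribution: always 60% "la", 40% end-of-sequence.
step = lambda prefix: {"la": math.log(0.6), "</s>": math.log(0.4)}
print(beam_search(step, "<s>", "</s>", beam_size=2, max_len=5))
```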
2 code implementations • 2 Apr 2021 • Suman Ravuri, Karel Lenc, Matthew Willson, Dmitry Kangin, Remi Lam, Piotr Mirowski, Megan Fitzsimons, Maria Athanassiadou, Sheleem Kashem, Sam Madge, Rachel Prudden, Amol Mandhane, Aidan Clark, Andrew Brock, Karen Simonyan, Raia Hadsell, Niall Robinson, Ellen Clancy, Alberto Arribas, Shakir Mohamed
To address these challenges, we present a Deep Generative Model for the probabilistic nowcasting of precipitation from radar.
no code implementations • 10 Mar 2021 • Sander Dieleman, Charlie Nash, Jesse Engel, Karen Simonyan
Semantically meaningful information content in perceptual signals is usually unevenly distributed.
19 code implementations • 11 Feb 2021 • Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan
Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples.
Ranked #31 on Image Classification on ImageNet
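One ingredient the paper combines with its normalizer-free architectures is adaptive gradient clipping (AGC), which rescales a unit's gradient when its norm grows too large relative to the corresponding weight norm. A simplified row-wise numpy sketch of that rule:

```python
import numpy as np

def adaptive_gradient_clip(grad, weight, clip=0.01, eps=1e-3):
    """Row-wise AGC: rescale rows of grad whose norm exceeds
    clip * max(row norm of weight, eps)."""
    g_norm = np.linalg.norm(grad, axis=-1, keepdims=True)
    w_norm = np.maximum(np.linalg.norm(weight, axis=-1, keepdims=True), eps)
    max_norm = clip * w_norm
    scale = np.where(g_norm > max_norm, max_norm / (g_norm + 1e-6), 1.0)
    return grad * scale

W = np.random.randn(8, 16)
G = 10.0 * np.random.randn(8, 16)   # an unusually large gradient
print(np.linalg.norm(adaptive_gradient_clip(G, W), axis=-1))  # rows now bounded
```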
no code implementations • ICLR 2021 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
For highly sparse networks, SnAp with $n=2$ remains tractable and can outperform backpropagation through time in terms of learning speed when updates are done online.
no code implementations • 12 Jun 2020 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves
Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states and prohibits updating the weights 'online' (after every timestep).
1 code implementation • 12 Jun 2020 • Jordan Hoffmann, Simon Schmitt, Simon Osindero, Karen Simonyan, Erich Elsen
Neural networks have historically been built layerwise from the set of functions in $\{f: \mathbb{R}^n \to \mathbb{R}^m\}$, i.e. with activations and weights/parameters represented by real numbers, $\mathbb{R}$.
2 code implementations • ICLR 2021 • Jeff Donahue, Sander Dieleman, Mikołaj Bińkowski, Erich Elsen, Karen Simonyan
Modern text-to-speech synthesis pipelines typically involve multiple processing stages, each of which is designed or learnt independently from the rest.
8 code implementations • NeurIPS 2020 • Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le
Normalization layers and activation functions are fundamental components in deep networks and typically co-locate with each other.
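One of the discovered layers, EvoNorm-S0, fuses a sigmoid gate with a group-wise standard deviation in a single expression. A numpy sketch of our reading of that layer (the grouping and affine details are simplified):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def evonorm_s0(x, gamma, beta, v, groups=8, eps=1e-5):
    """EvoNorm-S0: x * sigmoid(v*x) / group_std(x), followed by an affine.
    x has shape (N, H, W, C); v, gamma, beta have shape (C,)."""
    n, h, w, c = x.shape
    xg = x.reshape(n, h, w, groups, c // groups)
    std = np.sqrt(xg.var(axis=(1, 2, 4), keepdims=True) + eps)
    std = np.broadcast_to(std, xg.shape).reshape(n, h, w, c)
    return (x * sigmoid(v * x)) / std * gamma + beta

x = np.random.randn(2, 4, 4, 16)
y = evonorm_s0(x, np.ones(16), np.zeros(16), np.ones(16))
print(y.shape)
```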
no code implementations • 9 Mar 2020 • Pauline Luc, Aidan Clark, Sander Dieleman, Diego de Las Casas, Yotam Doron, Albin Cassirer, Karen Simonyan
Recent breakthroughs in adversarial generative modeling have led to models capable of producing video samples of high quality, even on large and complex datasets of real-world video.
Ranked #9 on Video Prediction on Kinetics-600 12 frames, 64x64
1 code implementation • 2 Dec 2019 • Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
Training generative adversarial networks requires balancing of delicate adversarial dynamics.
5 code implementations • CVPR 2020 • Erich Elsen, Marat Dukhan, Trevor Gale, Karen Simonyan
Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet v1, MobileNet v2 and EfficientNet architectures substantially outperform strong dense baselines on the efficiency-accuracy curve.
18 code implementations • 19 Nov 2019 • Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver
When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.
Ranked #1 on Atari Games on Atari 2600 Boxing
no code implementations • ICML 2020 • Simon Schmitt, Matteo Hessel, Karen Simonyan
We investigate the combination of actor-critic reinforcement learning algorithms with uniform large-scale experience replay and propose solutions for two challenges: (a) efficient actor-critic learning with experience replay, and (b) stability of off-policy learning where agents learn from other agents' behaviour.
Ranked #6 on Atari Games on Atari-57
3 code implementations • ICLR 2020 • Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan
However, their application in the audio domain has received limited attention, and autoregressive models, such as WaveNet, remain the state of the art in generative modelling of audio signals such as human speech.
1 code implementation • 15 Jul 2019 • Aidan Clark, Jeff Donahue, Karen Simonyan
Generative models of natural images have progressed towards high-fidelity samples by strongly leveraging scale.
Ranked #1 on Video Generation on Kinetics-600 12 frames, 128x128
4 code implementations • NeurIPS 2019 • Jeff Donahue, Karen Simonyan
We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation.
Ranked #10 on Contrastive Learning on imagenet-1k
no code implementations • 7 Jun 2019 • Karel Lenc, Erich Elsen, Tom Schaul, Karen Simonyan
While using ES for differentiable parameters is computationally impractical (although possible), we show that a hybrid approach is practically feasible in the case where the model has both differentiable and non-differentiable parameters.
no code implementations • 6 Mar 2019 • Jeffrey De Fauw, Sander Dieleman, Karen Simonyan
We show that autoregressive models conditioned on these representations can produce high-fidelity reconstructions of images, and that we can train autoregressive priors on these representations that produce samples with large-scale coherence.
1 code implementation • 4 Mar 2019 • Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell
However, these datasets cannot be used for decision-making and reinforcement learning, and in general the perspective of navigation as an interactive learning task, in which the actions and behaviours of a learning agent are learned simultaneously with perception and planning, is relatively unsupported.
34 code implementations • ICLR 2019 • Andrew Brock, Jeff Donahue, Karen Simonyan
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal.
Ranked #1 on Image Generation on CIFAR-10 (NFE metric)
5 code implementations • 10 Aug 2018 • Sageev Oore, Ian Simon, Sander Dieleman, Douglas Eck, Karen Simonyan
Music generation has generally been focused on either creating scores or interpreting them.
no code implementations • NeurIPS 2018 • Sander Dieleman, Aäron van den Oord, Karen Simonyan
It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations.
57 code implementations • ICLR 2019 • Hanxiao Liu, Karen Simonyan, Yiming Yang
This paper addresses the scalability challenge of architecture search by formulating the task in a differentiable manner.
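The continuous relaxation at the heart of the method replaces the hard choice among candidate operations on each edge with a softmax-weighted mixture, so the architecture parameters receive gradients like ordinary weights. A minimal sketch with toy stand-in operations:

```python
import numpy as np

def softmax(a):
    e = np.exp(a - a.max())
    return e / e.sum()

# Candidate operations on an edge of the cell (toy 1-D stand-ins for conv blocks etc.).
ops = [
    lambda x: x,                    # identity / skip connection
    lambda x: np.zeros_like(x),     # "none"
    lambda x: np.maximum(x, 0.0),   # relu, standing in for a learned op
]

def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidate ops."""
    w = softmax(alpha)
    return sum(wi * op(x) for wi, op in zip(w, ops))

x = np.array([-1.0, 2.0, -3.0])
alpha = np.array([0.1, -2.0, 1.5])  # architecture weights, learned jointly with the network
print(mixed_op(x, alpha))
# After search, each edge keeps its highest-weighted operation.
```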
4 code implementations • NeurIPS 2018 • Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell
We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.
no code implementations • 10 Mar 2018 • Simon Schmitt, Jonathan J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech M. Czarnecki, Joel Z. Leibo, Heinrich Kuttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami
Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance.
16 code implementations • ICML 2018 • Nal Kalchbrenner, Erich Elsen, Karen Simonyan, Seb Noury, Norman Casagrande, Edward Lockhart, Florian Stimberg, Aaron van den Oord, Sander Dieleman, Koray Kavukcuoglu
The small number of weights in a Sparse WaveRNN makes it possible to sample high-fidelity audio on a mobile CPU in real time.
2 code implementations • ICML 2018 • Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver
They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree.
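The simulate-evaluate-backup loop described here is the skeleton shared by Monte-Carlo tree search variants. A compact sketch of one generic MCTS simulation with a value estimate at the leaf (this is the classical baseline, not the learned searcher the paper proposes):

```python
import math

class Node:
    def __init__(self, state):
        self.state, self.children = state, {}
        self.visits, self.value_sum = 0, 0.0

def mcts_round(root, legal_actions, transition, evaluate, c=1.4):
    """One simulation: select by UCB down to a leaf, evaluate it, back up the value."""
    path, node = [root], root
    while node.children:                                   # 1. simulate ahead
        node = max(node.children.values(),
                   key=lambda ch: (ch.value_sum / (ch.visits + 1e-8)
                                   + c * math.sqrt(math.log(node.visits + 1)
                                                   / (ch.visits + 1e-8))))
        path.append(node)
    for a in legal_actions(node.state):                    # expand the leaf
        node.children[a] = Node(transition(node.state, a))
    value = evaluate(node.state)                           # 2. evaluate the future state
    for n in path:                                         # 3. back up to the root
        n.visits += 1
        n.value_sum += value
```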
23 code implementations • ICML 2018 • Lasse Espeholt, Hubert Soyer, Remi Munos, Karen Simonyan, Volodymir Mnih, Tom Ward, Yotam Doron, Vlad Firoiu, Tim Harley, Iain Dunning, Shane Legg, Koray Kavukcuoglu
In this work we aim to solve a large collection of tasks using a single reinforcement learning agent with a single set of parameters.
Ranked #3 on Atari Games on Atari 2600 Skiing (using extra training data)
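IMPALA's correction for the lag between actors and the learner is V-trace, which reweights off-policy temporal-difference errors with truncated importance ratios. A numpy sketch of the target computation for one trajectory, following our reading of the paper's definition:

```python
import numpy as np

def vtrace_targets(rewards, values, bootstrap, rhos, gamma=0.99,
                   rho_bar=1.0, c_bar=1.0):
    """V-trace targets: rewards/values over T steps, rhos = pi/mu importance ratios."""
    T = len(rewards)
    rho = np.minimum(rho_bar, rhos)                 # truncated ratios for the TD error
    c = np.minimum(c_bar, rhos)                     # truncated "trace" coefficients
    next_values = np.append(values[1:], bootstrap)
    deltas = rho * (rewards + gamma * next_values - values)
    vs = values.astype(float).copy()
    acc = 0.0
    for t in reversed(range(T)):                    # backward recursion over the trajectory
        acc = deltas[t] + gamma * c[t] * acc
        vs[t] = values[t] + acc
    return vs

r = np.array([0.0, 1.0, 0.0]); v = np.array([0.5, 0.6, 0.4])
print(vtrace_targets(r, v, bootstrap=0.3, rhos=np.array([1.2, 0.8, 1.0])))
```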
60 code implementations • 5 Dec 2017 • David Silver, Thomas Hubert, Julian Schrittwieser, Ioannis Antonoglou, Matthew Lai, Arthur Guez, Marc Lanctot, Laurent Sifre, Dharshan Kumaran, Thore Graepel, Timothy Lillicrap, Karen Simonyan, Demis Hassabis
The game of chess is the most widely-studied domain in the history of artificial intelligence.
Ranked #1 on Game of Go on ELO Ratings
2 code implementations • ICML 2018 • Aaron van den Oord, Yazhe Li, Igor Babuschkin, Karen Simonyan, Oriol Vinyals, Koray Kavukcuoglu, George van den Driessche, Edward Lockhart, Luis C. Cobo, Florian Stimberg, Norman Casagrande, Dominik Grewe, Seb Noury, Sander Dieleman, Erich Elsen, Nal Kalchbrenner, Heiga Zen, Alex Graves, Helen King, Tom Walters, Dan Belov, Demis Hassabis
The recently-developed WaveNet architecture is the current state of the art in realistic speech synthesis, consistently rated as more natural sounding for many different languages than any previous system.
9 code implementations • 27 Nov 2017 • Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu
Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm.
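Population Based Training interleaves partial training with exploit (copy a stronger member's weights and hyperparameters) and explore (perturb the copied hyperparameters). A minimal sketch of one generation; train_steps and evaluate are placeholders for the user's own training loop:

```python
import random

def pbt_generation(population, train_steps, evaluate, perturb=(0.8, 1.2)):
    """population: list of dicts with 'params' and 'hypers'.
    Trains every member, then the bottom half exploits and explores the top half."""
    for member in population:
        train_steps(member)                       # partial training
        member["score"] = evaluate(member)
    population.sort(key=lambda m: m["score"], reverse=True)
    half = len(population) // 2
    for loser in population[half:]:
        winner = random.choice(population[:half])
        loser["params"] = dict(winner["params"])          # exploit: copy weights
        loser["hypers"] = {k: v * random.choice(perturb)  # explore: jitter hypers
                           for k, v in winner["hypers"].items()}
    return population
```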
1 code implementation • ICLR 2018 • Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, Koray Kavukcuoglu
We explore efficient neural architecture search methods and show that a simple yet powerful evolutionary algorithm can discover new architectures with excellent performance.
11 code implementations • 16 Aug 2017 • Oriol Vinyals, Timo Ewalds, Sergey Bartunov, Petko Georgiev, Alexander Sasha Vezhnevets, Michelle Yeo, Alireza Makhzani, Heinrich Küttler, John Agapiou, Julian Schrittwieser, John Quan, Stephen Gaffney, Stig Petersen, Karen Simonyan, Tom Schaul, Hado van Hasselt, David Silver, Timothy Lillicrap, Kevin Calderone, Paul Keet, Anthony Brunasso, David Lawrence, Anders Ekermo, Jacob Repp, Rodney Tsing
Finally, we present initial baseline results for canonical deep reinforcement learning agents applied to the StarCraft II domain.
Ranked #1 on Starcraft II on MoveToBeacon
13 code implementations • 19 May 2017 • Will Kay, Joao Carreira, Karen Simonyan, Brian Zhang, Chloe Hillier, Sudheendra Vijayanarasimhan, Fabio Viola, Tim Green, Trevor Back, Paul Natsev, Mustafa Suleyman, Andrew Zisserman
We describe the DeepMind Kinetics human action video dataset.
7 code implementations • ICML 2017 • Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, Mohammad Norouzi
Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets.
11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu
The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.
Ranked #1 on Machine Translation on WMT2015 English-German
1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu
The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.
Ranked #1 on Video Prediction on KTH (Cond metric)
61 code implementations • 12 Sep 2016 • Aaron van den Oord, Sander Dieleman, Heiga Zen, Karen Simonyan, Oriol Vinyals, Alex Graves, Nal Kalchbrenner, Andrew Senior, Koray Kavukcuoglu
This paper introduces WaveNet, a deep neural network for generating raw audio waveforms.
Ranked #1 on Speech Synthesis on Mandarin Chinese
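WaveNet's building block is the causal dilated convolution: each output sample depends only on past samples, and doubling the dilation layer by layer gives a large receptive field cheaply. A numpy sketch of the idea with untrained random filters (illustrative only):

```python
import numpy as np

def causal_dilated_conv(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k*dilation]; left-padding ensures y[t]
    never sees the future."""
    k = len(weights)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return sum(w * xp[pad - i * dilation: pad - i * dilation + len(x)]
               for i, w in enumerate(weights))

x = np.random.randn(16)
y = x
for d in [1, 2, 4, 8]:                  # dilations double layer by layer
    y = np.tanh(causal_dilated_conv(y, np.random.randn(2) * 0.5, d))
print(y.shape)                          # (16,) with a receptive field of 16 samples
```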
1 code implementation • NeurIPS 2015 • Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu
We introduce Natural Neural Networks, a novel family of algorithms that speed up convergence by adapting their internal representation during training to improve conditioning of the Fisher matrix.
45 code implementations • NeurIPS 2015 • Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu
Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter-efficient manner.
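The spatial transformer the paper proposes addresses this by applying a learned affine transform to a sampling grid and bilinearly interpolating the input, all differentiably. A numpy sketch of the grid-generation and sampling steps for one single-channel image:

```python
import numpy as np

def affine_grid_sample(img, theta):
    """Warp img (H, W) by a 2x3 affine theta, sampling with bilinear interpolation.
    Target coordinates are normalised to [-1, 1], as in the paper."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w), indexing="ij")
    grid = theta @ np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # source coords
    sx = (grid[0] + 1) * (w - 1) / 2                                   # back to pixels
    sy = (grid[1] + 1) * (h - 1) / 2
    x0 = np.clip(np.floor(sx).astype(int), 0, w - 2)
    y0 = np.clip(np.floor(sy).astype(int), 0, h - 2)
    fx, fy = np.clip(sx - x0, 0, 1), np.clip(sy - y0, 0, 1)
    out = (img[y0, x0] * (1 - fx) * (1 - fy) + img[y0, x0 + 1] * fx * (1 - fy)
           + img[y0 + 1, x0] * (1 - fx) * fy + img[y0 + 1, x0 + 1] * fx * fy)
    return out.reshape(h, w)

img = np.arange(36, dtype=float).reshape(6, 6)
zoom = np.array([[0.5, 0.0, 0.0], [0.0, 0.5, 0.0]])  # 2x zoom into the centre
print(affine_grid_sample(img, zoom))
```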
no code implementations • 18 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
We develop a representation suitable for the unconstrained recognition of words in natural images: the general case of no fixed lexicon and unknown length.
no code implementations • 4 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In this work we present an end-to-end system for text spotting -- localising and recognising text in natural scene images -- and text-based image retrieval.
Ranked #15 on Scene Text Detection on ICDAR 2013
302 code implementations • 4 Sep 2014 • Karen Simonyan, Andrew Zisserman
In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting.
Ranked #2 on Classification on InDL
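The design rule behind these networks is uniform: stacks of 3x3 convolutions separated by 2x2 max-pooling, doubling the width after each pool. A sketch of the 16-layer variant (configuration D in the paper) as a plain layer list:

```python
# VGG-16 ("configuration D"): numbers are 3x3-conv output channels, "M" is 2x2 max-pool.
VGG16 = [64, 64, "M",
         128, 128, "M",
         256, 256, 256, "M",
         512, 512, 512, "M",
         512, 512, 512, "M"]   # followed by three fully connected layers

def describe(cfg, in_ch=3):
    """Print the conv stack implied by the configuration list."""
    for item in cfg:
        if item == "M":
            print("maxpool 2x2, stride 2")
        else:
            print(f"conv 3x3, {in_ch} -> {item} channels, ReLU")
            in_ch = item

describe(VGG16)
```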
no code implementations • 17 Jul 2014 • Ken Chatfield, Karen Simonyan, Andrew Zisserman
We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval, where classifiers are learnt at run time for a textual query from downloaded images and used to rank large image or video datasets.
7 code implementations • NeurIPS 2014 • Karen Simonyan, Andrew Zisserman
Our architecture is trained and evaluated on the standard video actions benchmarks of UCF-101 and HMDB-51, where it is competitive with the state of the art.
1 code implementation • 9 Jun 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In this work we present a framework for the recognition of natural scene text.
Ranked #37 on Scene Text Recognition on SVT
no code implementations • CVPR 2014 • Omkar M. Parkhi, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
Our goal is to learn a compact, discriminative vector representation of a face track, suitable for the face recognition tasks of verification and classification.
no code implementations • CVPR 2014 • Andrea Vedaldi, Siddharth Mahendran, Stavros Tsogkas, Subhransu Maji, Ross Girshick, Juho Kannala, Esa Rahtu, Iasonas Kokkinos, Matthew B. Blaschko, David Weiss, Ben Taskar, Karen Simonyan, Naomi Saphra, Sammy Mohamed
We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object.
1 code implementation • 14 May 2014 • Ken Chatfield, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
In particular, we show that the data augmentation techniques commonly applied to CNN-based methods can also be applied to shallow methods, and result in an analogous performance boost.
22 code implementations • 20 Dec 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets).
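Among the visualisations introduced is the image-specific saliency map: rank pixels by the magnitude of the gradient of the class score with respect to the input image. A self-contained sketch using a numerical gradient and a toy linear scorer (a real use backpropagates through the ConvNet in a single pass):

```python
import numpy as np

def saliency_map(score_fn, image, eps=1e-4):
    """Gradient-based saliency: the magnitude of d(class score)/d(pixel).
    The gradient is taken numerically here; in practice one backward pass suffices."""
    base = score_fn(image)
    sal = np.zeros_like(image)
    it = np.nditer(image, flags=["multi_index"])
    for _ in it:
        idx = it.multi_index
        bumped = image.copy()
        bumped[idx] += eps
        sal[idx] = (score_fn(bumped) - base) / eps
    return np.abs(sal)

# Toy class-score function: a fixed linear template, standing in for a ConvNet logit.
rng = np.random.default_rng(0)
template = rng.normal(size=(8, 8))
score = lambda img: float((template * img).sum())
print(saliency_map(score, rng.normal(size=(8, 8))).round(2))  # recovers |template|
```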
no code implementations • NeurIPS 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman
As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity.