Search Results for author: Karen Simonyan

Found 58 papers, 41 papers with code

Deep Fisher Networks for Large-Scale Image Classification

no code implementations • NeurIPS 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

As massively parallel computations have become broadly available with modern GPUs, deep architectures trained on very large datasets have risen in popularity.

Classification • General Classification • +1

Deep Inside Convolutional Networks: Visualising Image Classification Models and Saliency Maps

21 code implementations • 20 Dec 2013 • Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

This paper addresses the visualisation of image classification models, learnt using deep Convolutional Networks (ConvNets).

General Classification • Image Classification • +2

Return of the Devil in the Details: Delving Deep into Convolutional Nets

1 code implementation • 14 May 2014 • Ken Chatfield, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

In particular, we show that the data augmentation techniques commonly applied to CNN-based methods can also be applied to shallow methods, and result in an analogous performance boost.

Data Augmentation • object-detection • +1
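The flip-and-crop style of augmentation referred to above can be sketched in a few lines. This is an illustrative example only, not the paper's pipeline; the function name and parameters are hypothetical:

```python
import numpy as np

def augment(image, crop_size, n_crops=4):
    """Generate randomly cropped and horizontally flipped views of an
    image, in the style commonly used for CNN training pipelines."""
    rng = np.random.default_rng(0)
    h, w = image.shape[:2]
    views = []
    for _ in range(n_crops):
        y = rng.integers(0, h - crop_size + 1)
        x = rng.integers(0, w - crop_size + 1)
        crop = image[y:y + crop_size, x:x + crop_size]
        views.append(crop)
        views.append(crop[:, ::-1])  # horizontal flip of the same crop
    return views
```

The same set of views can then be fed to either a CNN or a shallow encoding, which is the comparison the paper draws.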

A Compact and Discriminative Face Track Descriptor

no code implementations • CVPR 2014 • Omkar M. Parkhi, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

Our goal is to learn a compact, discriminative vector representation of a face track, suitable for the face recognition tasks of verification and classification.

Binarization • Dimensionality Reduction • +4

Understanding Objects in Detail with Fine-Grained Attributes

no code implementations • CVPR 2014 • Andrea Vedaldi, Siddharth Mahendran, Stavros Tsogkas, Subhransu Maji, Ross Girshick, Juho Kannala, Esa Rahtu, Iasonas Kokkinos, Matthew B. Blaschko, David Weiss, Ben Taskar, Karen Simonyan, Naomi Saphra, Sammy Mohamed

We show that the collected data can be used to study the relation between part detection and attribute prediction by diagnosing the performance of classifiers that pool information from different parts of an object.

Attribute • Object • +2

Two-Stream Convolutional Networks for Action Recognition in Videos

7 code implementations • NeurIPS 2014 • Karen Simonyan, Andrew Zisserman

Our architecture is trained and evaluated on the standard video actions benchmarks of UCF-101 and HMDB-51, where it is competitive with the state of the art.

Action Classification • Action Recognition In Videos • +6

Efficient On-the-fly Category Retrieval using ConvNets and GPUs

no code implementations • 17 Jul 2014 • Ken Chatfield, Karen Simonyan, Andrew Zisserman

We investigate the gains in precision and speed that can be obtained by using Convolutional Networks (ConvNets) for on-the-fly retrieval, where classifiers are learnt at run time for a textual query from downloaded images and used to rank large image or video datasets.

Binarization • Quantization • +1

Very Deep Convolutional Networks for Large-Scale Image Recognition

299 code implementations • 4 Sep 2014 • Karen Simonyan, Andrew Zisserman

In this work we investigate the effect of the convolutional network depth on its accuracy in the large-scale image recognition setting.

Activity Recognition In Videos • Classification • +4
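The depth studied in this paper comes from stacking small 3x3 filters: two such layers cover a 5x5 receptive field while using fewer weights than a single 5x5 layer. A quick back-of-the-envelope check (illustrative only; the helper names are made up):

```python
def conv_params(kernel, c_in, c_out):
    """Weights in one convolutional layer (biases ignored)."""
    return kernel * kernel * c_in * c_out

def receptive_field(num_3x3_layers):
    """Receptive field of a stack of stride-1 3x3 convolutions."""
    return 2 * num_3x3_layers + 1

# Two stacked 3x3 layers on 64 channels see a 5x5 region
# but use fewer weights than a single 5x5 layer.
two_3x3 = 2 * conv_params(3, 64, 64)  # 73,728 weights
one_5x5 = conv_params(5, 64, 64)      # 102,400 weights
```

The stacked form also interleaves extra non-linearities, which the paper argues makes the decision function more discriminative.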

Reading Text in the Wild with Convolutional Neural Networks

no code implementations • 4 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

In this work we present an end-to-end system for text spotting -- localising and recognising text in natural scene images -- and text based image retrieval.

Image Retrieval • Region Proposal • +3

Deep Structured Output Learning for Unconstrained Text Recognition

no code implementations • 18 Dec 2014 • Max Jaderberg, Karen Simonyan, Andrea Vedaldi, Andrew Zisserman

We develop a representation suitable for the unconstrained recognition of words in natural images: the general case of no fixed lexicon and unknown length.

Language Modelling • Multi-Task Learning

Spatial Transformer Networks

44 code implementations • NeurIPS 2015 • Max Jaderberg, Karen Simonyan, Andrew Zisserman, Koray Kavukcuoglu

Convolutional Neural Networks define an exceptionally powerful class of models, but are still limited by the lack of ability to be spatially invariant to the input data in a computationally and parameter efficient manner.

Translation
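A drastically simplified sketch of the learned warping a spatial transformer applies: an affine matrix maps output coordinates to source coordinates, and the image is resampled on that grid. This toy version uses nearest-neighbour sampling and is not differentiable; the actual module uses bilinear sampling so that gradients flow through the transformation:

```python
import numpy as np

def affine_grid_sample(img, theta):
    """Warp `img` with a 2x3 affine matrix `theta` acting on normalised
    coordinates in [-1, 1], using nearest-neighbour sampling."""
    h, w = img.shape
    ys, xs = np.meshgrid(np.linspace(-1, 1, h), np.linspace(-1, 1, w),
                         indexing="ij")
    coords = np.stack([xs.ravel(), ys.ravel(), np.ones(h * w)])  # homogeneous
    sx, sy = theta @ coords                    # source coords per output pixel
    ix = np.clip(np.rint((sx + 1) * (w - 1) / 2), 0, w - 1).astype(int)
    iy = np.clip(np.rint((sy + 1) * (h - 1) / 2), 0, h - 1).astype(int)
    return img[iy, ix].reshape(h, w)
```

With the identity matrix the image is returned unchanged; with `[[-1, 0, 0], [0, 1, 0]]` it is flipped horizontally, which is the sense in which a learned `theta` can undo spatial variation in the input.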

Natural Neural Networks

1 code implementation • NeurIPS 2015 • Guillaume Desjardins, Karen Simonyan, Razvan Pascanu, Koray Kavukcuoglu

We introduce Natural Neural Networks, a novel family of algorithms that speed up convergence by adapting their internal representation during training to improve conditioning of the Fisher matrix.

Video Pixel Networks

1 code implementation • ICML 2017 • Nal Kalchbrenner, Aaron van den Oord, Karen Simonyan, Ivo Danihelka, Oriol Vinyals, Alex Graves, Koray Kavukcuoglu

The VPN approaches the best possible performance on the Moving MNIST benchmark, a leap over the previous state of the art, and the generated videos show only minor deviations from the ground truth.

 Ranked #1 on Video Prediction on KTH (Cond metric)

Video Prediction

Neural Machine Translation in Linear Time

11 code implementations • 31 Oct 2016 • Nal Kalchbrenner, Lasse Espeholt, Karen Simonyan, Aaron van den Oord, Alex Graves, Koray Kavukcuoglu

The ByteNet is a one-dimensional convolutional neural network that is composed of two parts, one to encode the source sequence and the other to decode the target sequence.

Language Modelling • Machine Translation • +2
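The decoder's building block is a causal dilated 1-D convolution, stacked with doubling dilation rates so the receptive field grows exponentially with depth. A minimal sketch of one such layer (illustrative only, not the paper's implementation):

```python
import numpy as np

def causal_dilated_conv1d(x, w, dilation):
    """y[t] = sum_j w[j] * x[t - j*dilation], zero-padded on the left so
    that output t depends only on inputs at or before t (causality)."""
    k = len(w)
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    return np.array([sum(w[j] * xp[pad + t - j * dilation] for j in range(k))
                     for t in range(len(x))])
```

Feeding an impulse through the layer shows the dilation at work: with kernel `[1, 1]` and dilation 2, each output mixes the current input with the one two steps back.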

Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders

5 code implementations • ICML 2017 • Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, Mohammad Norouzi

Generative models in vision have seen rapid progress due to algorithmic improvements and the availability of high-quality image datasets.

Audio Synthesis

Hierarchical Representations for Efficient Architecture Search

1 code implementation • ICLR 2018 • Hanxiao Liu, Karen Simonyan, Oriol Vinyals, Chrisantha Fernando, Koray Kavukcuoglu

We explore efficient neural architecture search methods and show that a simple yet powerful evolutionary algorithm can discover new architectures with excellent performance.

General Classification • Image Classification • +1

Population Based Training of Neural Networks

9 code implementations • 27 Nov 2017 • Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu

Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm.

Machine Translation • Model Selection

Learning to Search with MCTSnets

2 code implementations • ICML 2018 • Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver

They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree.

Kickstarting Deep Reinforcement Learning

no code implementations • 10 Mar 2018 • Simon Schmitt, Jonathan J. Hudson, Augustin Zidek, Simon Osindero, Carl Doersch, Wojciech M. Czarnecki, Joel Z. Leibo, Heinrich Kuttler, Andrew Zisserman, Karen Simonyan, S. M. Ali Eslami

Our method places no constraints on the architecture of the teacher or student agents, and it regulates itself to allow the students to surpass their teachers in performance.

reinforcement-learning • Reinforcement Learning (RL)

Learning to Navigate in Cities Without a Map

4 code implementations • NeurIPS 2018 • Piotr Mirowski, Matthew Koichi Grimes, Mateusz Malinowski, Karl Moritz Hermann, Keith Anderson, Denis Teplyashin, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

We present an interactive navigation environment that uses Google StreetView for its photographic content and worldwide coverage, and demonstrate that our learning method allows agents to learn to navigate multiple cities and to traverse to target destinations that may be kilometres away.

Autonomous Navigation • Navigate • +2

The challenge of realistic music generation: modelling raw audio at scale

no code implementations • NeurIPS 2018 • Sander Dieleman, Aäron van den Oord, Karen Simonyan

It has been shown that autoregressive models excel at generating raw audio waveforms of speech, but when applied to music, we find them biased towards capturing local signal structure at the expense of modelling long-range correlations.

Music Generation

This Time with Feeling: Learning Expressive Musical Performance

5 code implementations • 10 Aug 2018 • Sageev Oore, Ian Simon, Sander Dieleman, Douglas Eck, Karen Simonyan

Music generation has generally been focused on either creating scores or interpreting them.

Music Generation

Large Scale GAN Training for High Fidelity Natural Image Synthesis

33 code implementations • ICLR 2019 • Andrew Brock, Jeff Donahue, Karen Simonyan

Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal.

 Ranked #1 on Image Generation on CIFAR-10 (NFE metric)

Conditional Image Generation • Vocal Bursts Intensity Prediction

The StreetLearn Environment and Dataset

1 code implementation • 4 Mar 2019 • Piotr Mirowski, Andras Banki-Horvath, Keith Anderson, Denis Teplyashin, Karl Moritz Hermann, Mateusz Malinowski, Matthew Koichi Grimes, Karen Simonyan, Koray Kavukcuoglu, Andrew Zisserman, Raia Hadsell

These datasets cannot be used for decision-making and reinforcement learning, however, and the perspective of navigation as an interactive learning task, in which an agent's actions and behaviours are learned simultaneously with perception and planning, remains relatively unsupported.

Decision Making

Hierarchical Autoregressive Image Models with Auxiliary Decoders

no code implementations • 6 Mar 2019 • Jeffrey De Fauw, Sander Dieleman, Karen Simonyan

We show that autoregressive models conditioned on these representations can produce high-fidelity reconstructions of images, and that we can train autoregressive priors on these representations that produce samples with large-scale coherence.

Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods

no code implementations • 7 Jun 2019 • Karel Lenc, Erich Elsen, Tom Schaul, Karen Simonyan

While using ES for differentiable parameters is computationally impractical (although possible), we show that a hybrid approach is practically feasible in the case where the model has both differentiable and non-differentiable parameters.
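The non-differentiable parameters in such a hybrid are handled with an evolution-strategies gradient estimate. The standard antithetic estimator can be sketched as follows (an illustrative sketch with made-up names, not the authors' implementation):

```python
import numpy as np

def es_gradient(f, theta, sigma=0.1, n_pairs=100, seed=0):
    """Antithetic evolution-strategies estimate of grad f(theta):
    g ~ 1/(2 n sigma) * sum_i (f(theta + sigma*e_i) - f(theta - sigma*e_i)) e_i
    where e_i are standard-normal perturbations. Only evaluations of f
    are needed, so f may be non-differentiable."""
    rng = np.random.default_rng(seed)
    g = np.zeros_like(theta)
    for _ in range(n_pairs):
        eps = rng.standard_normal(theta.shape)
        g += (f(theta + sigma * eps) - f(theta - sigma * eps)) * eps
    return g / (2 * n_pairs * sigma)
```

On a smooth test function the estimate converges to the true gradient as the number of perturbation pairs grows, which is why the hybrid setting can mix these estimates with exact gradients for the differentiable parameters.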

Large Scale Adversarial Representation Learning

4 code implementations • NeurIPS 2019 • Jeff Donahue, Karen Simonyan

We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation.

Contrastive Learning • Image Generation • +4

Off-Policy Actor-Critic with Shared Experience Replay

no code implementations • ICML 2020 • Simon Schmitt, Matteo Hessel, Karen Simonyan

We investigate the combination of actor-critic reinforcement learning algorithms with uniform large-scale experience replay and propose solutions for two challenges: (a) efficient actor-critic learning with experience replay, and (b) stability of off-policy learning where agents learn from other agents' behaviour.

Atari Games

High Fidelity Speech Synthesis with Adversarial Networks

3 code implementations • ICLR 2020 • Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan

However, their application in the audio domain has received limited attention, and autoregressive models, such as WaveNet, remain the state of the art in generative modelling of audio signals such as human speech.

Generative Adversarial Network • Speech Synthesis • +1

Mastering Atari, Go, Chess and Shogi by Planning with a Learned Model

18 code implementations • 19 Nov 2019 • Julian Schrittwieser, Ioannis Antonoglou, Thomas Hubert, Karen Simonyan, Laurent Sifre, Simon Schmitt, Arthur Guez, Edward Lockhart, Demis Hassabis, Thore Graepel, Timothy Lillicrap, David Silver

When evaluated on Go, chess and shogi, without any knowledge of the game rules, MuZero matched the superhuman performance of the AlphaZero algorithm that was supplied with the game rules.

Atari Games • Atari Games 100k • +3

Fast Sparse ConvNets

4 code implementations • CVPR 2020 • Erich Elsen, Marat Dukhan, Trevor Gale, Karen Simonyan

Equipped with our efficient implementation of sparse primitives, we show that sparse versions of MobileNet v1, MobileNet v2 and EfficientNet architectures substantially outperform strong dense baselines on the efficiency-accuracy curve.

Transformation-based Adversarial Video Prediction on Large-Scale Data

no code implementations • 9 Mar 2020 • Pauline Luc, Aidan Clark, Sander Dieleman, Diego de Las Casas, Yotam Doron, Albin Cassirer, Karen Simonyan

Recent breakthroughs in adversarial generative modeling have led to models capable of producing video samples of high quality, even on large and complex datasets of real-world video.

Video Generation • Video Prediction

Evolving Normalization-Activation Layers

8 code implementations • NeurIPS 2020 • Hanxiao Liu, Andrew Brock, Karen Simonyan, Quoc V. Le

Normalization layers and activation functions are fundamental components in deep networks and typically co-locate with each other.

Image Classification • Image Generation • +2

End-to-End Adversarial Text-to-Speech

2 code implementations • ICLR 2021 • Jeff Donahue, Sander Dieleman, Mikołaj Bińkowski, Erich Elsen, Karen Simonyan

Modern text-to-speech synthesis pipelines typically involve multiple processing stages, each of which is designed or learnt independently from the rest.

Adversarial Text • Dynamic Time Warping • +2

A Practical Sparse Approximation for Real Time Recurrent Learning

no code implementations • 12 Jun 2020 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves

Current methods for training recurrent neural networks are based on backpropagation through time, which requires storing a complete history of network states, and prohibits updating the weights `online' (after every timestep).
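For a scalar linear recurrence, the forward-mode (RTRL-style) alternative to backpropagation through time maintains a single running sensitivity instead of the full state history, so the weight can be updated online after every timestep. A toy illustration (not the paper's sparse approximation):

```python
def rtrl_scalar(xs, w, h0=0.0):
    """Forward-mode gradient of the final state of the recurrence
    h_t = w*h_{t-1} + x_t with respect to w, carried online in O(1)
    memory: s_t = dh_t/dw = h_{t-1} + w*s_{t-1} by the chain rule."""
    h, s = h0, 0.0
    for x in xs:
        s = h + w * s  # propagate the sensitivity before updating h
        h = w * h + x
    return h, s
```

For a full network the sensitivity object becomes a large Jacobian, which is exactly the cost the paper's sparse approximation (SnAp) is designed to tame.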

AlgebraNets

1 code implementation • 12 Jun 2020 • Jordan Hoffmann, Simon Schmitt, Simon Osindero, Karen Simonyan, Erich Elsen

Neural networks have historically been built layerwise from the set of functions in $\{f: \mathbb{R}^n \to \mathbb{R}^m\}$, i.e. with activations and weights/parameters represented by real numbers, $\mathbb{R}$.

Computational Efficiency • Image Classification • +1
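One alternative algebra to plain $\mathbb{R}$ considered in this line of work is the complex numbers, where a single "weight" carries two real parameters and a complex matrix product decomposes into four real matmuls (an illustrative sketch, not the paper's code):

```python
import numpy as np

def complex_matmul(wr, wi, xr, xi):
    """Product (wr + i*wi) @ (xr + i*xi) expressed with four real
    matrix multiplications; returns (real part, imaginary part)."""
    return wr @ xr - wi @ xi, wr @ xi + wi @ xr
```

Checking the decomposition against NumPy's native complex arithmetic confirms the two representations agree.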

Practical Real Time Recurrent Learning with a Sparse Approximation

no code implementations • ICLR 2021 • Jacob Menick, Erich Elsen, Utku Evci, Simon Osindero, Karen Simonyan, Alex Graves

For highly sparse networks, SnAp with $n=2$ remains tractable and can outperform backpropagation through time in terms of learning speed when updates are done online.

High-Performance Large-Scale Image Recognition Without Normalization

19 code implementations • 11 Feb 2021 • Andrew Brock, Soham De, Samuel L. Smith, Karen Simonyan

Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples.

Image Classification • Vocal Bursts Intensity Prediction
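The batch-size dependence mentioned above comes from computing normalisation statistics across examples. A minimal sketch (inference-time scale/shift parameters omitted) makes the cross-example coupling explicit:

```python
import numpy as np

def batch_norm(x, eps=1e-5):
    """Normalise each feature using mean and variance computed across
    the batch dimension (axis 0) — so one example's output depends on
    which other examples share its batch."""
    mu = x.mean(axis=0)
    var = x.var(axis=0)
    return (x - mu) / np.sqrt(var + eps)
```

Normalising the same example inside two different batches gives two different outputs, which is the interaction between examples that normalizer-free networks are designed to remove.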

Variable-rate discrete representation learning

no code implementations • 10 Mar 2021 • Sander Dieleman, Charlie Nash, Jesse Engel, Karen Simonyan

Semantically meaningful information content in perceptual signals is usually unevenly distributed.

Representation Learning

Scaling Language Models: Methods, Analysis & Insights from Training Gopher

2 code implementations • NA 2021 • Jack W. Rae, Sebastian Borgeaud, Trevor Cai, Katie Millican, Jordan Hoffmann, Francis Song, John Aslanides, Sarah Henderson, Roman Ring, Susannah Young, Eliza Rutherford, Tom Hennigan, Jacob Menick, Albin Cassirer, Richard Powell, George van den Driessche, Lisa Anne Hendricks, Maribeth Rauh, Po-Sen Huang, Amelia Glaese, Johannes Welbl, Sumanth Dathathri, Saffron Huang, Jonathan Uesato, John Mellor, Irina Higgins, Antonia Creswell, Nat McAleese, Amy Wu, Erich Elsen, Siddhant Jayakumar, Elena Buchatskaya, David Budden, Esme Sutherland, Karen Simonyan, Michela Paganini, Laurent Sifre, Lena Martens, Xiang Lorraine Li, Adhiguna Kuncoro, Aida Nematzadeh, Elena Gribovskaya, Domenic Donato, Angeliki Lazaridou, Arthur Mensch, Jean-Baptiste Lespiau, Maria Tsimpoukelli, Nikolai Grigorev, Doug Fritz, Thibault Sottiaux, Mantas Pajarskas, Toby Pohlen, Zhitao Gong, Daniel Toyama, Cyprien de Masson d'Autume, Yujia Li, Tayfun Terzi, Vladimir Mikulik, Igor Babuschkin, Aidan Clark, Diego de Las Casas, Aurelia Guy, Chris Jones, James Bradbury, Matthew Johnson, Blake Hechtman, Laura Weidinger, Iason Gabriel, William Isaac, Ed Lockhart, Simon Osindero, Laura Rimell, Chris Dyer, Oriol Vinyals, Kareem Ayoub, Jeff Stanway, Lorrayne Bennett, Demis Hassabis, Koray Kavukcuoglu, Geoffrey Irving

Language modelling provides a step towards intelligent communication systems by harnessing large repositories of written human knowledge to better predict and understand the world.

Abstract Algebra • Anachronisms • +133

HiP: Hierarchical Perceiver

2 code implementations • 22 Feb 2022 • Joao Carreira, Skanda Koppula, Daniel Zoran, Adria Recasens, Catalin Ionescu, Olivier Henaff, Evan Shelhamer, Relja Arandjelovic, Matt Botvinick, Oriol Vinyals, Karen Simonyan, Andrew Zisserman, Andrew Jaegle

This, however, hinders them from scaling up to the input sizes required to process raw high-resolution images or video.
