1 code implementation • Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millican, Malcolm Reynolds, Roman Ring, Eliza Rutherford, Serkan Cabi, Tengda Han, Zhitao Gong, Sina Samangooei, Marianne Monteiro, Jacob Menick, Sebastian Borgeaud, Andrew Brock, Aida Nematzadeh, Sahand Sharifzadeh, Mikolaj Binkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, Karen Simonyan
Building models that can be rapidly adapted to numerous tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.
Ranked #1 on Zero-Shot Learning on iVQA
7 code implementations • Andrew Jaegle, Sebastian Borgeaud, Jean-Baptiste Alayrac, Carl Doersch, Catalin Ionescu, David Ding, Skanda Koppula, Daniel Zoran, Andrew Brock, Evan Shelhamer, Olivier Hénaff, Matthew M. Botvinick, Andrew Zisserman, Oriol Vinyals, João Carreira
A central goal of machine learning is the development of systems that can solve many problems in as many data domains as possible.
Ranked #1 on Optical Flow Estimation on KITTI 2015 (Average End-Point Error metric)
In this work, we provide a detailed empirical evaluation of how the number of augmentation samples per unique image influences model performance on held-out data when training deep ResNets.
Ranked #58 on Image Classification on ImageNet
1 code implementation • 2 Apr 2021 • Suman Ravuri, Karel Lenc, Matthew Willson, Dmitry Kangin, Remi Lam, Piotr Mirowski, Megan Fitzsimons, Maria Athanassiadou, Sheleem Kashem, Sam Madge, Rachel Prudden, Amol Mandhane, Aidan Clark, Andrew Brock, Karen Simonyan, Raia Hadsell, Niall Robinson, Ellen Clancy, Alberto Arribas, Shakir Mohamed
To address these challenges, we present a Deep Generative Model for the probabilistic nowcasting of precipitation from radar.
The perception models used in deep learning, on the other hand, are designed for individual modalities, often relying on domain-specific assumptions such as the local grid structures exploited by virtually all existing vision models.
Ranked #11 on Audio Classification on AudioSet
Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples.
Ranked #15 on Image Classification on ImageNet (using extra training data)
Batch Normalization is a key component in almost all state-of-the-art image classifiers, but it also introduces practical challenges: it breaks the independence between training examples within a batch, can incur compute and memory overhead, and often results in unexpected bugs.
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics.
3 code implementations • 20 Oct 2020 • Pierre H. Richemond, Jean-Bastien Grill, Florent Altché, Corentin Tallec, Florian Strub, Andrew Brock, Samuel Smith, Soham De, Razvan Pascanu, Bilal Piot, Michal Valko
Bootstrap Your Own Latent (BYOL) is a self-supervised learning approach for image representation.
Normalization layers and activation functions are fundamental components in deep networks and typically co-locate with each other.
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal.
Ranked #2 on Conditional Image Generation on ImageNet 128x128
Modern neural networks tend to be overconfident on unseen, noisy or incorrectly labelled data and do not produce meaningful uncertainty measures.
The increasingly photorealistic sample quality of generative image models suggests their feasibility in applications beyond image generation.
When working with three-dimensional data, choice of representation is key.
Ranked #6 on 3D Point Cloud Classification on ModelNet40 (Mean Accuracy metric)