1 code implementation • DeepMind 2022 • Jean-Baptiste Alayrac, Jeff Donahue, Pauline Luc, Antoine Miech, Iain Barr, Yana Hasson, Karel Lenc, Arthur Mensch, Katie Millican, Malcolm Reynolds, Roman Ring, Eliza Rutherford, Serkan Cabi, Tengda Han, Zhitao Gong, Sina Samangooei, Marianne Monteiro, Jacob Menick, Sebastian Borgeaud, Andrew Brock, Aida Nematzadeh, Sahand Sharifzadeh, Mikołaj Bińkowski, Ricardo Barreira, Oriol Vinyals, Andrew Zisserman, Karen Simonyan
Building models that can be rapidly adapted to novel tasks using only a handful of annotated examples is an open challenge for multimodal machine learning research.
Ranked #1 on Temporal/Causal QA on NExT-QA
1 code implementation • NeurIPS 2020 • Chongli Qin, Yan Wu, Jost Tobias Springenberg, Andrew Brock, Jeff Donahue, Timothy P. Lillicrap, Pushmeet Kohli
From this perspective, we hypothesise that instabilities in training GANs arise from the integration error in discretising the continuous dynamics.
2 code implementations • ICLR 2021 • Jeff Donahue, Sander Dieleman, Mikołaj Bińkowski, Erich Elsen, Karen Simonyan
Modern text-to-speech synthesis pipelines typically involve multiple processing stages, each of which is designed or learnt independently from the rest.
1 code implementation • 2 Dec 2019 • Yan Wu, Jeff Donahue, David Balduzzi, Karen Simonyan, Timothy Lillicrap
Training generative adversarial networks requires balancing of delicate adversarial dynamics.
3 code implementations • ICLR 2020 • Mikołaj Bińkowski, Jeff Donahue, Sander Dieleman, Aidan Clark, Erich Elsen, Norman Casagrande, Luis C. Cobo, Karen Simonyan
However, their application in the audio domain has received limited attention, and autoregressive models, such as WaveNet, remain the state of the art in generative modelling of audio signals such as human speech.
1 code implementation • 15 Jul 2019 • Aidan Clark, Jeff Donahue, Karen Simonyan
Generative models of natural images have progressed towards high fidelity samples by the strong leveraging of scale.
Ranked #1 on Video Generation on Kinetics-600 48 frames, 64x64
3 code implementations • NeurIPS 2019 • Jeff Donahue, Karen Simonyan
We extensively evaluate the representation learning and generation capabilities of these BigBiGAN models, demonstrating that these generation-based models achieve the state of the art in unsupervised representation learning on ImageNet, as well as in unconditional image generation.
Ranked #10 on Contrastive Learning on imagenet-1k
28 code implementations • ICLR 2019 • Andrew Brock, Jeff Donahue, Karen Simonyan
Despite recent progress in generative image modeling, successfully generating high-resolution, diverse samples from complex datasets such as ImageNet remains an elusive goal.
Ranked #3 on Conditional Image Generation on ArtBench-10 (32x32)
7 code implementations • 27 Nov 2017 • Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, Koray Kavukcuoglu
Neural networks dominate the modern machine learning landscape, but their training and success still suffer from sensitivity to empirical choices of hyperparameters such as model architecture, loss function, and optimisation algorithm.
no code implementations • 15 Feb 2017 • Andrew Zhai, Dmitry Kislyuk, Yushi Jing, Michael Feng, Eric Tzeng, Jeff Donahue, Yue Li Du, Trevor Darrell
Over the past three years Pinterest has experimented with several visual search and recommendation services, including Related Pins (2014), Similar Looks (2015), Flashlight (2016) and Lens (2017).
11 code implementations • 31 May 2016 • Jeff Donahue, Philipp Krähenbühl, Trevor Darrell
The ability of the Generative Adversarial Networks (GANs) framework to learn generative models mapping from simple latent distributions to arbitrarily complex data distributions has been demonstrated empirically, with compelling results showing that the latent space of such generators captures semantic variation in the data distribution.
11 code implementations • CVPR 2016 • Deepak Pathak, Philipp Krähenbühl, Jeff Donahue, Trevor Darrell, Alexei A. Efros
In order to succeed at this task, context encoders need to both understand the content of the entire image, as well as produce a plausible hypothesis for the missing part(s).
no code implementations • 28 Mar 2016 • Lisa Anne Hendricks, Zeynep Akata, Marcus Rohrbach, Jeff Donahue, Bernt Schiele, Trevor Darrell
Clearly explaining a rationale for a classification decision to an end-user can be as important as the decision itself.
2 code implementations • 21 Nov 2015 • Philipp Krähenbühl, Carl Doersch, Jeff Donahue, Trevor Darrell
Convolutional Neural Networks spread through computer vision like a wildfire, impacting almost all visual tasks imaginable.
no code implementations • 28 May 2015 • Yushi Jing, David Liu, Dmitry Kislyuk, Andrew Zhai, Jiajing Xu, Jeff Donahue, Sarah Tavel
We demonstrate that, with the availability of distributed computation platforms such as Amazon Web Services and open-source tools, it is possible for a small engineering team to build, launch and maintain a cost-effective, large-scale visual search system with widely available tools.
3 code implementations • 3 May 2015 • Subhashini Venugopalan, Marcus Rohrbach, Jeff Donahue, Raymond Mooney, Trevor Darrell, Kate Saenko
Our LSTM model is trained on video-sentence pairs and learns to associate a sequence of video frames to a sequence of words in order to generate a description of the event in the video clip.
1 code implementation • HLT 2015 • Subhashini Venugopalan, Huijuan Xu, Jeff Donahue, Marcus Rohrbach, Raymond Mooney, Kate Saenko
Solving the visual symbol grounding problem has long been a goal of artificial intelligence.
7 code implementations • CVPR 2015 • Jeff Donahue, Lisa Anne Hendricks, Marcus Rohrbach, Subhashini Venugopalan, Sergio Guadarrama, Kate Saenko, Trevor Darrell
Models based on deep convolutional networks have dominated recent image interpretation tasks; we investigate whether models which are also recurrent, or "temporally deep", are effective for tasks involving sequences, visual and otherwise.
Ranked #3 on Human Interaction Recognition on BIT
1 code implementation • NeurIPS 2014 • Judy Hoffman, Sergio Guadarrama, Eric Tzeng, Ronghang Hu, Jeff Donahue, Ross Girshick, Trevor Darrell, Kate Saenko
A major challenge in scaling object detection is the difficulty of obtaining labeled images for large numbers of categories.
no code implementations • 15 Jul 2014 • Ning Zhang, Jeff Donahue, Ross Girshick, Trevor Darrell
Semantic part localization can facilitate fine-grained categorization by explicitly isolating subtle appearance differences associated with specific object parts.
Ranked #62 on Fine-Grained Image Classification on CUB-200-2011
2 code implementations • 20 Jun 2014 • Yangqing Jia, Evan Shelhamer, Jeff Donahue, Sergey Karayev, Jonathan Long, Ross Girshick, Sergio Guadarrama, Trevor Darrell
The framework is a BSD-licensed C++ library with Python and MATLAB bindings for training and deploying general-purpose convolutional neural networks and other deep models efficiently on commodity architectures.
no code implementations • 21 Dec 2013 • Judy Hoffman, Eric Tzeng, Jeff Donahue, Yangqing Jia, Kate Saenko, Trevor Darrell
In other words, are deep CNNs trained on large amounts of labeled data as susceptible to dataset bias as previous methods have been shown to be?
28 code implementations • CVPR 2014 • Ross Girshick, Jeff Donahue, Trevor Darrell, Jitendra Malik
We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset.
Ranked #26 on Object Detection on PASCAL VOC 2007
8 code implementations • 6 Oct 2013 • Jeff Donahue, Yangqing Jia, Oriol Vinyals, Judy Hoffman, Ning Zhang, Eric Tzeng, Trevor Darrell
We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks.
no code implementations • 20 Aug 2013 • Erik Rodner, Judy Hoffman, Jeff Donahue, Trevor Darrell, Kate Saenko
Images seen during test time are often not from the same distribution as images used for learning.
no code implementations • CVPR 2013 • Jeff Donahue, Judy Hoffman, Erik Rodner, Kate Saenko, Trevor Darrell
Most successful object classification and detection methods rely on classifiers trained on large labeled datasets.
no code implementations • 15 Jan 2013 • Judy Hoffman, Erik Rodner, Jeff Donahue, Trevor Darrell, Kate Saenko
We present an algorithm that learns representations which explicitly compensate for domain mismatch and which can be efficiently realized as linear classifiers.