1 code implementation • CVPR 2022 • Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi
In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass.
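A minimal sketch of the encode-once, decode-per-ray structure described above: a CNN backbone turns the input views into patch tokens, an encoder Transformer pools them into the set-latent scene representation, and a query ray cross-attends into that set to produce a color. Layer sizes, the backbone, and the ray parameterization below are illustrative assumptions, not the paper's configuration.

```python
# Sketch of an SRT-style encoder/decoder split (sizes are illustrative).
import torch
import torch.nn as nn

class TinySRT(nn.Module):
    def __init__(self, d=128, heads=4, layers=2):
        super().__init__()
        # CNN turns each input image into a grid of patch features.
        self.backbone = nn.Conv2d(3, d, kernel_size=8, stride=8)
        enc_layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, layers)
        # Decoder: a query ray attends into the set-latent representation.
        self.ray_embed = nn.Linear(6, d)          # origin (3) + direction (3)
        self.cross_attn = nn.MultiheadAttention(d, heads, batch_first=True)
        self.to_rgb = nn.Linear(d, 3)

    def encode(self, images):                     # images: (B, N, 3, H, W)
        B, N, C, H, W = images.shape
        feats = self.backbone(images.flatten(0, 1))        # (B*N, d, h, w)
        tokens = feats.flatten(2).transpose(1, 2)          # (B*N, h*w, d)
        tokens = tokens.reshape(B, -1, tokens.shape[-1])   # pool all views
        return self.encoder(tokens)               # set-latent scene rep.

    def decode(self, scene, rays):                # rays: (B, R, 6)
        q = self.ray_embed(rays)
        out, _ = self.cross_attn(q, scene, scene)
        return torch.sigmoid(self.to_rgb(out))    # (B, R, 3) colors

srt = TinySRT()
scene = srt.encode(torch.randn(1, 5, 3, 64, 64))  # 5 input views
rgb = srt.decode(scene, torch.randn(1, 1024, 6))  # query 1024 rays
```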
16 code implementations • 18 Jun 2021 • Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer
Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.
48 code implementations • NeurIPS 2021 • Ilya Tolstikhin, Neil Houlsby, Alexander Kolesnikov, Lucas Beyer, Xiaohua Zhai, Thomas Unterthiner, Jessica Yung, Andreas Steiner, Daniel Keysers, Jakob Uszkoreit, Mario Lucic, Alexey Dosovitskiy
Convolutional Neural Networks (CNNs) are the go-to model for computer vision.
Ranked #17 on Image Classification on OmniBenchmark
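This entry is the MLP-Mixer paper. A minimal sketch of one Mixer block, assuming tokens of shape (batch, patches, channels): a token-mixing MLP acts across patches, then a channel-mixing MLP acts across channels; the hidden width is illustrative.

```python
# One MLP-Mixer block: token mixing followed by channel mixing.
import torch
import torch.nn as nn

class MixerBlock(nn.Module):
    def __init__(self, patches, channels, hidden=256):
        super().__init__()
        self.norm1 = nn.LayerNorm(channels)
        # Token-mixing MLP: acts across the patch dimension.
        self.token_mlp = nn.Sequential(
            nn.Linear(patches, hidden), nn.GELU(), nn.Linear(hidden, patches))
        self.norm2 = nn.LayerNorm(channels)
        # Channel-mixing MLP: acts across the channel dimension.
        self.channel_mlp = nn.Sequential(
            nn.Linear(channels, hidden), nn.GELU(), nn.Linear(hidden, channels))

    def forward(self, x):                       # x: (B, P, C)
        y = self.norm1(x).transpose(1, 2)       # (B, C, P)
        x = x + self.token_mlp(y).transpose(1, 2)
        return x + self.channel_mlp(self.norm2(x))

block = MixerBlock(patches=196, channels=512)
out = block(torch.randn(2, 196, 512))           # shape preserved
```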
no code implementations • CVPR 2021 • Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner
Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand.
150 code implementations • ICLR 2021 • Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, Neil Houlsby
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited.
Ranked #1 on Image Classification on CIFAR-10 (using extra training data)
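This entry is the original Vision Transformer (ViT) paper. A minimal sketch of its front end: the image is split into fixed-size patches, each patch is linearly embedded, a class token and position embeddings are added, and the token sequence goes through a standard Transformer encoder. All sizes below are illustrative.

```python
# ViT-style patch embedding plus a standard Transformer encoder.
import torch
import torch.nn as nn

class TinyViT(nn.Module):
    def __init__(self, img=32, patch=4, d=192, heads=3, layers=6, classes=10):
        super().__init__()
        n = (img // patch) ** 2
        # Patch embedding as a strided convolution (equivalent to a
        # linear projection of flattened patches).
        self.embed = nn.Conv2d(3, d, kernel_size=patch, stride=patch)
        self.cls = nn.Parameter(torch.zeros(1, 1, d))
        self.pos = nn.Parameter(torch.zeros(1, n + 1, d))
        layer = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, layers)
        self.head = nn.Linear(d, classes)

    def forward(self, x):                             # x: (B, 3, 32, 32)
        tokens = self.embed(x).flatten(2).transpose(1, 2)   # (B, 64, d)
        cls = self.cls.expand(x.shape[0], -1, -1)
        tokens = torch.cat([cls, tokens], dim=1) + self.pos
        return self.head(self.encoder(tokens)[:, 0])  # classify via [CLS]

logits = TinyViT()(torch.randn(2, 3, 32, 32))         # (2, 10)
```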
no code implementations • EMNLP (nlpbt) 2020 • Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain
In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language.
8 code implementations • NeurIPS 2020 • Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf
Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.
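This entry introduces Slot Attention. A minimal single-head sketch over pre-extracted input features: slots compete for inputs via an attention softmax normalized over slots, and are updated with a GRU. The learned slot initialization below simplifies the paper's sampled initialization, and sizes are illustrative.

```python
# Single-head Slot Attention sketch with a learned slot initialization.
import torch
import torch.nn as nn

class SlotAttention(nn.Module):
    def __init__(self, d=64, n_slots=4, iters=3):
        super().__init__()
        self.iters, self.scale = iters, d ** -0.5
        self.slots_init = nn.Parameter(torch.randn(1, n_slots, d))
        self.to_q, self.to_k, self.to_v = (nn.Linear(d, d) for _ in range(3))
        self.gru = nn.GRUCell(d, d)
        self.norm_in, self.norm_slots = nn.LayerNorm(d), nn.LayerNorm(d)

    def forward(self, inputs):                        # inputs: (B, N, d)
        B, N, d = inputs.shape
        inputs = self.norm_in(inputs)
        k, v = self.to_k(inputs), self.to_v(inputs)
        slots = self.slots_init.expand(B, -1, -1)
        for _ in range(self.iters):
            q = self.to_q(self.norm_slots(slots))
            logits = torch.einsum('bnd,bsd->bns', k, q) * self.scale
            # Softmax over slots: slots compete for each input feature.
            attn = logits.softmax(dim=-1) + 1e-8
            attn = attn / attn.sum(dim=1, keepdim=True)   # weighted mean
            updates = torch.einsum('bns,bnd->bsd', attn, v)
            slots = self.gru(updates.reshape(-1, d),
                             slots.reshape(-1, d)).view(B, -1, d)
        return slots

slots = SlotAttention()(torch.randn(2, 100, 64))      # (2, 4, 64)
```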
no code implementations • EMNLP 2020 • William Chan, Mitchell Stern, Jamie Kiros, Jakob Uszkoreit
In this work, we present an empirical study of generation order for machine translation.
1 code implementation • ICLR 2020 • Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit
Due to the statistical complexity of video, the high degree of inherent stochasticity, and the sheer amount of data, generating natural video remains a challenging task.
Ranked #7 on Video Generation on BAIR Robot Pushing
no code implementations • 4 Jun 2019 • William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit
During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$.
Ranked #38 on Machine Translation on WMT2014 English-German
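A minimal sketch of the data mixing described above, written with a simple left-to-right factorization for brevity; KERMIT itself generates by insertions, not left to right. The `model` interface and separator token are hypothetical stand-ins.

```python
# Mixing paired and unpaired losses with one shared sequence model.
import torch
import torch.nn as nn

SEP = 0  # hypothetical separator token id

def joint_loss(model, x, y):
    # p(x, y): model the concatenation of both sequences.
    pair = torch.cat([x, torch.full_like(x[:, :1], SEP), y], dim=1)
    return nn.functional.cross_entropy(
        model(pair[:, :-1]).transpose(1, 2), pair[:, 1:])

def marginal_loss(model, x):
    # p(x): the same model on an unpaired sequence.
    return nn.functional.cross_entropy(
        model(x[:, :-1]).transpose(1, 2), x[:, 1:])

# One training step mixes paired and unpaired batches, e.g.:
# loss = joint_loss(model, x, y) + marginal_loss(model, x_unpaired)
```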
1 code implementation • Transactions of the Association for Computational Linguistics 2019 • Tom Kwiatkowski, Jennimaria Palomaki, Olivia Redfield, Michael Collins, Ankur Parikh, Chris Alberti, Danielle Epstein, Illia Polosukhin, Jacob Devlin, Kenton Lee, Kristina Toutanova, Llion Jones, Matthew Kelcey, Ming-Wei Chang, Andrew M. Dai, Jakob Uszkoreit, Quoc Le, Slav Petrov
The public release consists of 307,373 training examples with single annotations, 7,830 examples with 5-way annotations for development data, and a further 7,842 examples 5-way annotated, sequestered as test data.
Ranked #7 on Question Answering on Natural Questions (long)
no code implementations • 8 Feb 2019 • Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit
We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations.
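A minimal sketch of the serial greedy variant of insertion-based decoding: at each step the model scores (slot, token) pairs over every gap in the current hypothesis, the best insertion is applied, and decoding stops when an end token wins. `score_insertions` and the `END` id are hypothetical stand-ins for the trained model; the paper also describes parallel variants that insert at many slots at once.

```python
# Greedy insertion-based decoding loop (serial variant).
import torch

END = 0  # hypothetical "no further insertion" token id

def greedy_insertion_decode(score_insertions, max_steps=100):
    hyp = []                                    # start from an empty canvas
    for _ in range(max_steps):
        # scores: (len(hyp) + 1, vocab) — one row per insertion slot.
        scores = score_insertions(hyp)
        slot, token = divmod(int(scores.argmax()), scores.shape[1])
        if token == END:
            return hyp                          # model chose to stop
        hyp.insert(slot, token)                 # insert token into slot
    return hyp
```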
no code implementations • NeurIPS 2018 • Mitchell Stern, Noam Shazeer, Jakob Uszkoreit
Deep autoregressive sequence-to-sequence models have demonstrated impressive performance across a wide variety of tasks in recent years.
11 code implementations • ICLR 2019 • Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck
This is impractical for long sequences such as musical compositions, since the memory required for the intermediate relative position information is quadratic in the sequence length.
Ranked #3 on Music Modeling on JSB Chorales
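This entry is the Music Transformer paper, whose remedy for the quadratic memory cost noted above is a "skewing" trick: compute Q Eᵀ once, then realign it with a pad, reshape, and slice, avoiding the (L, L, D) tensor of intermediate relative information. A sketch, with indexing conventions simplified:

```python
# Memory-efficient relative attention logits via the skewing trick.
import torch

def skew(qe):                          # qe: (L, L) = Q @ E^T
    L = qe.shape[0]
    padded = torch.nn.functional.pad(qe, (1, 0))    # (L, L+1), pad left
    return padded.reshape(L + 1, L)[1:]             # (L, L), realigned

def relative_logits(q, e):
    # q: (L, d) queries; e: (L, d) embeddings of relative distances.
    return skew(q @ e.T)               # O(L*d) extra memory, not O(L^2*d)

q, e = torch.randn(8, 16), torch.randn(8, 16)
rel = relative_logits(q, e)            # (8, 8) relative position logits
```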
8 code implementations • ICLR 2019 • Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser
Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.
Ranked #30 on Language Modelling on LAMBADA
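This entry is the Universal Transformer paper. A minimal sketch of its recurrence in depth: one weight-tied block applied repeatedly to all positions in parallel, with a timestep signal added at each step. The fixed step count below stands in for the paper's adaptive (ACT-based) halting.

```python
# A weight-tied Transformer block applied recurrently over depth.
import torch
import torch.nn as nn

class TinyUniversalTransformer(nn.Module):
    def __init__(self, d=128, heads=4, steps=6):
        super().__init__()
        self.steps = steps
        # A single shared layer reused at every depth step.
        self.block = nn.TransformerEncoderLayer(d, heads, batch_first=True)
        self.step_embed = nn.Embedding(steps, d)    # timestep signal

    def forward(self, x):                           # x: (B, T, d)
        for t in range(self.steps):
            x = self.block(x + self.step_embed.weight[t])
        return x

out = TinyUniversalTransformer()(torch.randn(2, 10, 128))
```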
15 code implementations • WS 2018 • Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit
Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.
no code implementations • ICML 2018 • Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer
Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.
12 code implementations • NAACL 2018 • Peter Shaw, Jakob Uszkoreit, Ashish Vaswani
On the WMT 2014 English-to-German and English-to-French translation tasks, this approach yields improvements of 1.3 BLEU and 0.3 BLEU over absolute position representations, respectively.
Ranked #22 on Machine Translation on WMT2014 English-French
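A minimal single-head sketch of the mechanism behind these gains: learned embeddings of clipped relative distances are added to the keys when computing attention logits. The paper also adds relative embeddings to the values, omitted here for brevity; sizes and the clipping distance are illustrative.

```python
# Self-attention with relative position representations on the keys.
import torch
import torch.nn as nn

class RelativeSelfAttention(nn.Module):
    def __init__(self, d=64, max_dist=4):
        super().__init__()
        self.k = max_dist
        self.rel_k = nn.Embedding(2 * max_dist + 1, d)   # a^K_{ij}
        self.qkv = nn.Linear(d, 3 * d)

    def forward(self, x):                                # x: (B, T, d)
        B, T, d = x.shape
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        pos = torch.arange(T)
        rel = (pos[None, :] - pos[:, None]).clamp(-self.k, self.k) + self.k
        a = self.rel_k(rel)                              # (T, T, d)
        logits = q @ k.transpose(1, 2)                   # content term
        logits = logits + torch.einsum('btd,tsd->bts', q, a)
        attn = (logits / d ** 0.5).softmax(dim=-1)
        return attn @ v

out = RelativeSelfAttention()(torch.randn(2, 10, 64))
```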
no code implementations • 15 Feb 2018 • Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran
Image generation has been successfully cast as an autoregressive sequence generation or transformation problem.
Ranked #4 on Density Estimation on CIFAR-10
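A minimal sketch of that casting: flatten the pixels into a sequence and sample one position at a time conditioned on the prefix. `model` is a hypothetical next-intensity predictor; the paper's actual contribution is making self-attention tractable for such long sequences via local attention.

```python
# Autoregressive image sampling over a flattened pixel sequence.
import torch

def sample_image(model, h=8, w=8, channels=3):
    seq = torch.zeros(1, 0, dtype=torch.long)      # empty prefix
    for _ in range(h * w * channels):
        probs = model(seq).softmax(dim=-1)         # (1, vocab) for next value
        nxt = torch.multinomial(probs, 1)
        seq = torch.cat([seq, nxt], dim=1)
    return seq.view(1, h, w, channels)             # back to image layout
```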
no code implementations • ICLR 2018 • Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit
We present a single model that yields good results on a number of problems spanning multiple domains.
no code implementations • ACL 2017 • Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant
We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.
1 code implementation • 16 Jun 2017 • Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit
We present a single model that yields good results on a number of problems spanning multiple domains.
582 code implementations • NeurIPS 2017 • Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin
The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.
Ranked #2 on Multimodal Machine Translation on Multi30K (BLEU (DE-EN) metric)
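This entry is the original Transformer paper ("Attention Is All You Need"). Its core primitive is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ / √d_k) V; a direct sketch:

```python
# Scaled dot-product attention, the core Transformer operation.
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); boolean mask broadcasts over logits.
    d_k = q.shape[-1]
    logits = q @ k.transpose(-2, -1) / d_k ** 0.5
    if mask is not None:
        logits = logits.masked_fill(~mask, float('-inf'))
    return logits.softmax(dim=-1) @ v

q = k = v = torch.randn(2, 10, 64)
out = scaled_dot_product_attention(q, k, v)        # (2, 10, 64)
```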
no code implementations • WS 2017 • Gaurav Singh Tomar, Thyago Duque, Oscar Täckström, Jakob Uszkoreit, Dipanjan Das
We present a solution to the problem of paraphrase identification of questions.
Ranked #15 on Paraphrase Identification on Quora Question Pairs (Accuracy metric)
no code implementations • 6 Nov 2016 • Eunsol Choi, Daniel Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant
We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.
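A minimal sketch of a coarse-to-fine select-then-read pipeline consistent with the scaling claim above: a cheap scorer selects a few relevant sentences from the long document, and only those are passed to an expensive answer model. Both models and their interfaces are hypothetical stand-ins, not the paper's exact architecture.

```python
# Select-then-read question answering over a long document.
import torch

def answer(cheap_scorer, reader, question, sentences, k=3):
    # Coarse step: score every sentence against the question.
    scores = torch.stack([cheap_scorer(question, s) for s in sentences])
    top = scores.topk(min(k, len(sentences))).indices
    # Fine step: read only the selected sentences.
    summary = [sentences[i] for i in top.tolist()]
    return reader(question, summary)
```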
10 code implementations • EMNLP 2016 • Ankur P. Parikh, Oscar Täckström, Dipanjan Das, Jakob Uszkoreit
We propose a simple neural architecture for natural language inference.
Ranked #48 on Natural Language Inference on SNLI
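This entry is the decomposable attention model. A minimal sketch of its attend / compare / aggregate structure over pre-embedded token vectors; MLP widths are illustrative.

```python
# Decomposable attention: attend, compare, aggregate.
import torch
import torch.nn as nn

def mlp(d_in, d_out, hidden=200):
    return nn.Sequential(nn.Linear(d_in, hidden), nn.ReLU(),
                         nn.Linear(hidden, d_out))

class DecomposableAttention(nn.Module):
    def __init__(self, d=300, classes=3):
        super().__init__()
        self.F = mlp(d, d)             # attend
        self.G = mlp(2 * d, d)         # compare
        self.H = mlp(2 * d, classes)   # aggregate + classify

    def forward(self, a, b):           # a: (B, La, d), b: (B, Lb, d)
        # Attend: soft-align tokens of a with tokens of b, and vice versa.
        e = self.F(a) @ self.F(b).transpose(1, 2)        # (B, La, Lb)
        beta = e.softmax(dim=2) @ b                      # b aligned to a
        alpha = e.softmax(dim=1).transpose(1, 2) @ a     # a aligned to b
        # Compare aligned pairs, then aggregate by summing.
        v1 = self.G(torch.cat([a, beta], dim=-1)).sum(dim=1)
        v2 = self.G(torch.cat([b, alpha], dim=-1)).sum(dim=1)
        return self.H(torch.cat([v1, v2], dim=-1))

logits = DecomposableAttention()(torch.randn(2, 7, 300),
                                 torch.randn(2, 9, 300))  # (2, 3)
```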