Search Results for author: Jakob Uszkoreit

Found 29 papers, 13 papers with code

Scene Representation Transformer: Geometry-Free Novel View Synthesis Through Set-Latent Scene Representations

no code implementations 25 Nov 2021 Mehdi S. M. Sajjadi, Henning Meyer, Etienne Pot, Urs Bergmann, Klaus Greff, Noha Radwan, Suhani Vora, Mario Lucic, Daniel Duckworth, Alexey Dosovitskiy, Jakob Uszkoreit, Thomas Funkhouser, Andrea Tagliasacchi

In this work, we propose the Scene Representation Transformer (SRT), a method which processes posed or unposed RGB images of a new area, infers a "set-latent scene representation", and synthesises novel views, all in a single feed-forward pass.

Novel View Synthesis · Semantic Segmentation

How to train your ViT? Data, Augmentation, and Regularization in Vision Transformers

4 code implementations 18 Jun 2021 Andreas Steiner, Alexander Kolesnikov, Xiaohua Zhai, Ross Wightman, Jakob Uszkoreit, Lucas Beyer

Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.

Data Augmentation · Image Classification +2

Differentiable Patch Selection for Image Recognition

no code implementations CVPR 2021 Jean-Baptiste Cordonnier, Aravindh Mahendran, Alexey Dosovitskiy, Dirk Weissenborn, Jakob Uszkoreit, Thomas Unterthiner

Neural Networks require large amounts of memory and compute to process high resolution images, even when only a small part of the image is actually informative for the task at hand.

Traffic Sign Recognition

Towards End-to-End In-Image Neural Machine Translation

no code implementations EMNLP (nlpbt) 2020 Elman Mansimov, Mitchell Stern, Mia Chen, Orhan Firat, Jakob Uszkoreit, Puneet Jain

In this paper, we offer a preliminary investigation into the task of in-image machine translation: transforming an image containing text in one language into an image containing the same text in another language.

Machine Translation · Translation

Object-Centric Learning with Slot Attention

3 code implementations NeurIPS 2020 Francesco Locatello, Dirk Weissenborn, Thomas Unterthiner, Aravindh Mahendran, Georg Heigold, Jakob Uszkoreit, Alexey Dosovitskiy, Thomas Kipf

Learning object-centric representations of complex scenes is a promising step towards enabling efficient abstract reasoning from low-level perceptual features.

Object Discovery
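The competition between slots that drives this grouping can be sketched in a few lines of NumPy. This is a stripped-down illustration only: the actual Slot Attention module adds learned query/key/value projections, a GRU update, and LayerNorm, all omitted here.

```python
import numpy as np

def slot_attention_step(slots, inputs):
    """One simplified iteration: the softmax is over the *slot* axis (not the
    input axis), so slots compete for each input feature; each slot is then
    updated with the weighted mean of the inputs it won."""
    d = slots.shape[-1]
    logits = inputs @ slots.T / np.sqrt(d)                      # (n_inputs, n_slots)
    attn = np.exp(logits - logits.max(axis=1, keepdims=True))
    attn /= attn.sum(axis=1, keepdims=True)                     # softmax over slots
    weights = attn / (attn.sum(axis=0, keepdims=True) + 1e-8)   # normalize per slot
    return weights.T @ inputs                                   # (n_slots, d)

# two well-separated "object" clusters, two slots: each slot captures one cluster
inputs = np.array([[5.0, 0.0], [5.0, 0.1], [-5.0, 0.0], [-5.0, -0.1]])
slots = np.array([[1.0, 0.0], [-1.0, 0.0]])
print(slot_attention_step(slots, inputs))
```

The slot-axis softmax is the key design choice: a softmax over inputs would let every slot attend everywhere, while softmax over slots forces the slots to partition the input among themselves.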

Scaling Autoregressive Video Models

1 code implementation ICLR 2020 Dirk Weissenborn, Oscar Täckström, Jakob Uszkoreit

Due to the statistical complexity of video, the high degree of inherent stochasticity, and the sheer amount of data, generating natural video remains a challenging task.

Action Recognition · Video Generation +1

KERMIT: Generative Insertion-Based Modeling for Sequences

no code implementations 4 Jun 2019 William Chan, Nikita Kitaev, Kelvin Guu, Mitchell Stern, Jakob Uszkoreit

During training, one can feed KERMIT paired data $(x, y)$ to learn the joint distribution $p(x, y)$, and optionally mix in unpaired data $x$ or $y$ to refine the marginals $p(x)$ or $p(y)$.

Machine Translation · Question Answering +2

Insertion Transformer: Flexible Sequence Generation via Insertion Operations

no code implementations 8 Feb 2019 Mitchell Stern, William Chan, Jamie Kiros, Jakob Uszkoreit

We present the Insertion Transformer, an iterative, partially autoregressive model for sequence generation based on insertion operations.

Machine Translation · Translation
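Generation by insertion can be sketched as a simple loop; the toy "model" below is a hypothetical scripted stand-in for the real network, which scores (slot, token) pairs jointly:

```python
def insertion_decode(predict, max_steps):
    """Build the output by repeated insertion: the model returns a
    (slot, token) pair, where slot indexes the len(seq)+1 gaps of the
    partial sequence; token=None signals the sequence is complete."""
    seq = []
    for _ in range(max_steps):
        slot, token = predict(seq)
        if token is None:
            break
        seq.insert(slot, token)
    return seq

def scripted(moves):
    """Toy 'model' replaying a fixed insertion script (for illustration)."""
    it = iter(moves)
    return lambda seq: next(it, (0, None))

# generate [1, 2, 3] middle-out: insert 2 first, then 1 before it, then 3 after
print(insertion_decode(scripted([(0, 2), (0, 1), (2, 3)]), 10))  # [1, 2, 3]
```

Because the insertion order is free, the model can generate in a balanced binary-tree order and complete a length-n sequence in roughly log n parallel steps.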

Blockwise Parallel Decoding for Deep Autoregressive Models

no code implementations NeurIPS 2018 Mitchell Stern, Noam Shazeer, Jakob Uszkoreit

Deep autoregressive sequence-to-sequence models have demonstrated impressive performance across a wide variety of tasks in recent years.

Image Super-Resolution · Machine Translation +1
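The predict–verify–accept scheme behind blockwise parallel decoding can be sketched with toy stand-in models (both lambdas below are hypothetical: the real method uses auxiliary proposal heads on top of the base network):

```python
def blockwise_decode(base_next, propose_block, prefix, k, steps):
    """Propose the next k tokens at once, verify each against the base
    model's greedy choice given the preceding tokens, and accept the longest
    agreeing prefix; fall back to one greedy token when nothing matches."""
    out = list(prefix)
    for _ in range(steps):
        block = propose_block(out, k)              # k tokens proposed in parallel
        accepted = 0
        for j in range(k):                         # verification (parallelizable)
            if block[j] == base_next(out + block[:j]):
                accepted += 1
            else:
                break
        if accepted:
            out.extend(block[:accepted])
        else:
            out.append(base_next(out))             # worst case: plain greedy step
    return out

# toy "models": the true sequence just counts upward modulo 10
base_next = lambda seq: (seq[-1] + 1) % 10
propose_block = lambda seq, k: [(seq[-1] + 1 + i) % 10 for i in range(k)]
print(blockwise_decode(base_next, propose_block, [0], 4, 3))  # 13 tokens in 3 steps
```

Since verification only needs to agree with what the base model would have produced greedily, the output is identical to ordinary greedy decoding, just reached in fewer sequential steps.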

Music Transformer

10 code implementations ICLR 2019 Cheng-Zhi Anna Huang, Ashish Vaswani, Jakob Uszkoreit, Noam Shazeer, Ian Simon, Curtis Hawthorne, Andrew M. Dai, Matthew D. Hoffman, Monica Dinculescu, Douglas Eck

This is impractical for long sequences such as musical compositions since their memory complexity for intermediate relative information is quadratic in the sequence length.

Music Modeling

Universal Transformers

7 code implementations ICLR 2019 Mostafa Dehghani, Stephan Gouws, Oriol Vinyals, Jakob Uszkoreit, Łukasz Kaiser

Feed-forward and convolutional architectures have recently been shown to achieve superior results on some sequence modeling tasks such as machine translation, with the added advantage that they concurrently process all inputs in the sequence, leading to easy parallelization and faster training times.

Language Modelling · Learning to Execute +2
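The core architectural change, recurrence over depth with tied weights, can be sketched as follows. The `transition` lambda is a dummy stand-in for the real self-attention plus feed-forward block, and this omits the paper's dynamic (ACT-based) halting:

```python
import numpy as np

def timestep_embedding(t, d):
    # sinusoidal signal identifying the recurrence step, analogous to the
    # position embedding of the original Transformer
    i = np.arange(d)
    angles = t / (10000.0 ** (2 * (i // 2) / d))
    return np.where(i % 2 == 0, np.sin(angles), np.cos(angles))

def universal_encode(x, transition, steps):
    """Unlike a stacked Transformer, the *same* transition function (one set
    of weights) is applied at every depth step."""
    for t in range(steps):
        x = transition(x + timestep_embedding(t, x.shape[-1]))
    return x

transition = lambda h: np.tanh(h)    # dummy transition function
x = np.zeros((3, 8))                 # 3 positions, model width 8
print(universal_encode(x, transition, steps=4).shape)  # (3, 8)
```

Tying weights across depth is what makes the model "universal" in spirit: the number of refinement steps becomes a runtime choice rather than a fixed architectural constant.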

Tensor2Tensor for Neural Machine Translation

15 code implementations WS 2018 Ashish Vaswani, Samy Bengio, Eugene Brevdo, Francois Chollet, Aidan N. Gomez, Stephan Gouws, Llion Jones, Łukasz Kaiser, Nal Kalchbrenner, Niki Parmar, Ryan Sepassi, Noam Shazeer, Jakob Uszkoreit

Tensor2Tensor is a library for deep learning models that is well-suited for neural machine translation and includes the reference implementation of the state-of-the-art Transformer model.

Machine Translation · Translation

Fast Decoding in Sequence Models using Discrete Latent Variables

no code implementations ICML 2018 Łukasz Kaiser, Aurko Roy, Ashish Vaswani, Niki Parmar, Samy Bengio, Jakob Uszkoreit, Noam Shazeer

Finally, we evaluate our model end-to-end on the task of neural machine translation, where it is an order of magnitude faster at decoding than comparable autoregressive models.

Machine Translation · Translation

Self-Attention with Relative Position Representations

9 code implementations NAACL 2018 Peter Shaw, Jakob Uszkoreit, Ashish Vaswani

On the WMT 2014 English-to-German and English-to-French translation tasks, this approach yields improvements of 1.3 BLEU and 0.3 BLEU over absolute position representations, respectively.

Machine Translation · Translation
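The paper's key-side relative term can be sketched in NumPy: a learned embedding of the clipped relative distance j − i is added to each key before the dot product with the query.

```python
import numpy as np

def relative_attention_logits(Q, K, rel_key_emb, max_rel):
    """e_ij = (q_i . k_j + q_i . a^K_ij) / sqrt(d), where a^K_ij is a learned
    embedding of the relative distance j - i, clipped to [-max_rel, max_rel]."""
    n, d = Q.shape
    dist = np.arange(n)[None, :] - np.arange(n)[:, None]      # j - i
    idx = np.clip(dist, -max_rel, max_rel) + max_rel          # -> 0 .. 2*max_rel
    a = rel_key_emb[idx]                                      # (n, n, d)
    return (Q @ K.T + np.einsum('id,ijd->ij', Q, a)) / np.sqrt(d)

rng = np.random.default_rng(0)
n, d, max_rel = 6, 4, 2
Q, K = rng.normal(size=(n, d)), rng.normal(size=(n, d))
rel_key_emb = rng.normal(size=(2 * max_rel + 1, d))           # one vector per offset
print(relative_attention_logits(Q, K, rel_key_emb, max_rel).shape)  # (6, 6)
```

Clipping means all positions farther apart than `max_rel` share one embedding, so the table of relative vectors stays small regardless of sequence length. (This sketch shows only the key-side term; the paper also describes an analogous value-side term.)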

Image Transformer

no code implementations 15 Feb 2018 Niki Parmar, Ashish Vaswani, Jakob Uszkoreit, Łukasz Kaiser, Noam Shazeer, Alexander Ku, Dustin Tran

Image generation has been successfully cast as an autoregressive sequence generation or transformation problem.

Ranked #6 on Image Generation on ImageNet 32x32 (bpd metric)

Image Generation · Image Super-Resolution

Coarse-to-Fine Question Answering for Long Documents

no code implementations ACL 2017 Eunsol Choi, Daniel Hewlett, Jakob Uszkoreit, Illia Polosukhin, Alexandre Lacoste, Jonathan Berant

We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.

Question Answering · Reading Comprehension +1

One Model To Learn Them All

1 code implementation 16 Jun 2017 Lukasz Kaiser, Aidan N. Gomez, Noam Shazeer, Ashish Vaswani, Niki Parmar, Llion Jones, Jakob Uszkoreit

We present a single model that yields good results on a number of problems spanning multiple domains.

Image Captioning · Image Classification +2

Attention Is All You Need

508 code implementations NeurIPS 2017 Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, Illia Polosukhin

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.

Ranked #2 on Multimodal Machine Translation on Multi30K (BLEU (DE-EN) metric)

Abstractive Text Summarization · Constituency Parsing +2
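At the heart of the architecture this paper replaces recurrence with is scaled dot-product attention, Attention(Q, K, V) = softmax(QKᵀ/√d_k)V. A minimal single-head NumPy sketch (the full model adds multi-head projections, masking, and residual layers):

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)                      # (n_q, n_k) logits
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)       # row-wise softmax
    return weights @ V                                   # convex mix of values

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 8))   # 3 queries
K = rng.normal(size=(5, 8))   # 5 keys
V = rng.normal(size=(5, 8))   # 5 values
print(scaled_dot_product_attention(Q, K, V).shape)  # (3, 8)
```

The 1/√d_k scaling keeps the logits in a range where the softmax has usable gradients as the key dimension grows.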

Hierarchical Question Answering for Long Documents

no code implementations6 Nov 2016 Eunsol Choi, Daniel Hewlett, Alexandre Lacoste, Illia Polosukhin, Jakob Uszkoreit, Jonathan Berant

We present a framework for question answering that can efficiently scale to longer documents while maintaining or even improving performance of state-of-the-art models.

Question Answering · Reading Comprehension +1
