1 code implementation • 20 Sep 2022 • Timo Lohrenz, Björn Möller, Zhengyang Li, Tim Fingscheidt
The powerful modeling capabilities of all-attention-based transformer architectures often cause overfitting and, for natural language processing tasks, lead to an implicitly learned internal language model in the autoregressive transformer decoder, which complicates the integration of external language models.
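This integration difficulty is easiest to see in shallow fusion, a common way to combine an external language model with the decoder at inference time. The following is a minimal, self-contained Python sketch of one shallow-fusion decoding step; the toy vocabulary, the stand-in probability distributions, and the `lm_weight` value are illustrative assumptions and not the paper's implementation.

```python
import math

# Toy vocabulary and hypothetical per-step distributions; all values below
# are illustrative assumptions, not taken from the paper or its experiments.
VOCAB = ["<eos>", "hello", "world", "word"]

def decoder_log_probs(prefix):
    """Hypothetical autoregressive decoder distribution p(y_t | x, y_<t).

    A real transformer decoder also conditions on the encoder output x;
    here a fixed toy distribution stands in for it.
    """
    probs = [0.1, 0.2, 0.6, 0.1] if prefix else [0.05, 0.8, 0.1, 0.05]
    return [math.log(p) for p in probs]

def external_lm_log_probs(prefix):
    """Hypothetical external language model distribution p_LM(y_t | y_<t)."""
    probs = [0.1, 0.3, 0.3, 0.3] if prefix else [0.1, 0.6, 0.2, 0.1]
    return [math.log(p) for p in probs]

def shallow_fusion_step(prefix, lm_weight=0.3):
    """One greedy decoding step with shallow fusion:

        score(y) = log p_dec(y | x, y_<t) + lm_weight * log p_LM(y | y_<t)

    Because the decoder's probabilities already contain an implicitly
    learned internal language model, the external LM partly re-scores the
    same prior, which is the integration difficulty described above.
    """
    dec = decoder_log_probs(prefix)
    lm = external_lm_log_probs(prefix)
    scores = [d + lm_weight * l for d, l in zip(dec, lm)]
    best = max(range(len(VOCAB)), key=lambda i: scores[i])
    return VOCAB[best], scores[best]

if __name__ == "__main__":
    prefix = []
    for _ in range(3):
        token, score = shallow_fusion_step(prefix)
        print(f"prefix={prefix!r} -> {token} (score {score:.3f})")
        if token == "<eos>":
            break
        prefix.append(token)
```

The sketch only illustrates why an internal language model inside the decoder interferes with this kind of fusion; the paper itself addresses the problem on the training side rather than through the decoding rule shown here.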
Ranked #3 on Lipreading on LRS3-TED (using extra training data)