Search Results for author: Telmo Pessoa Pires

Found 4 papers, 0 papers with code

One Wide Feedforward is All You Need

no code implementations • 4 Sep 2023 • Telmo Pessoa Pires, António V. Lopes, Yannick Assogba, Hendra Setiawan

The Transformer architecture has two main non-embedding components: Attention and the Feed Forward Network (FFN).

Decoder, Position

State Spaces Aren't Enough: Machine Translation Needs Attention

no code implementations • 25 Apr 2023 • Ali Vardasbi, Telmo Pessoa Pires, Robin M. Schmidt, Stephan Peitz

Structured State Spaces for Sequences (S4) is a recently proposed sequence model with successful applications in various tasks, e.g., vision, language modeling, and audio.

Decoder, Language Modelling, +3
