Search Results for author: Daniele Coppola

Found 1 paper, 0 papers with code

Rethinking Attention: Exploring Shallow Feed-Forward Neural Networks as an Alternative to Attention Layers in Transformers

no code implementations • 17 Nov 2023 • Vukasin Bozic, Danilo Dordevic, Daniele Coppola, Joseph Thommes, Sidak Pal Singh

This work analyzes how effectively standard shallow feed-forward networks can mimic the behavior of the attention mechanism in the original Transformer model, a state-of-the-art architecture for sequence-to-sequence tasks.

Knowledge Distillation
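The idea the abstract describes can be sketched as follows: a shallow feed-forward "student" network is trained, via knowledge distillation (MSE against the teacher's outputs), to reproduce the output of a scaled dot-product self-attention "teacher" on fixed-length sequences. This is a minimal illustrative sketch, not the authors' implementation; the toy dimensions, single hidden layer, and plain NumPy training loop are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: a fixed-length sequence of token embeddings.
L, d, hidden = 4, 8, 32  # sequence length, embedding dim, student hidden width

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def self_attention(X, Wq, Wk, Wv):
    """Teacher: standard scaled dot-product self-attention."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    A = softmax(Q @ K.T / np.sqrt(d))
    return A @ V  # (L, d)

def student(X, W1, W2):
    """Student: one-hidden-layer feed-forward net on the flattened sequence."""
    h = np.maximum(X.reshape(-1) @ W1, 0.0)  # ReLU hidden layer
    return (h @ W2).reshape(L, d)

# Fixed (untrained) teacher projections and random student init.
Wq, Wk, Wv = (rng.normal(0, 0.1, (d, d)) for _ in range(3))
W1 = rng.normal(0, 0.1, (L * d, hidden))
W2 = rng.normal(0, 0.1, (hidden, L * d))

# Distillation loop: the student regresses onto the teacher's outputs.
lr = 0.05
for step in range(500):
    X = rng.normal(size=(L, d))
    target = self_attention(X, Wq, Wk, Wv)   # teacher output
    x = X.reshape(-1)
    h = np.maximum(x @ W1, 0.0)
    pred = (h @ W2).reshape(L, d)
    err = (pred - target) / (L * d)          # gradient of the MSE loss
    gW2 = np.outer(h, err.reshape(-1))
    gW1 = np.outer(x, (err.reshape(-1) @ W2.T) * (h > 0))
    W2 -= lr * gW2
    W1 -= lr * gW1

# After training, the student produces outputs of the same shape as attention.
X = rng.normal(size=(L, d))
print(student(X, W1, W2).shape)
```

Note that this flattened-input student only works for a fixed sequence length, which is one of the practical limitations of replacing attention with a plain feed-forward layer.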
