Synthesizer

Introduced by Tay et al. in Synthesizer: Rethinking Self-Attention in Transformer Models

The Synthesizer is a model that learns synthetic attention weights without token-token interactions. Unlike Transformers, it eschews not only dot-product self-attention but content-based self-attention altogether. Instead of computing pairwise dot products, the Synthesizer learns to synthesize the self-alignment matrix directly. It is transformation-based, relies only on simple feed-forward layers, and dispenses with dot products and explicit token-token interactions.

The new module employed by the Synthesizer is called "Synthetic Attention": a way of learning to attend without explicitly attending (i.e., without dot-product attention or content-based attention). Instead, the Synthesizer generates the alignment matrix independently of token-token dependencies.
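
The idea can be illustrated with a minimal NumPy sketch of two ways to synthesize the alignment matrix: a dense variant, where each token is passed on its own through a small feed-forward network that outputs one row of scores, and a random variant, where the score matrix is a free parameter that ignores the input entirely. This is an illustration under simplifying assumptions (single head, fixed sequence length, no masking, randomly initialized rather than trained weights), not the authors' implementation; the function and parameter names (dense_synthesizer, random_synthesizer, W1, W2, Wv, R) are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dense_synthesizer(X, W1, b1, W2, b2, Wv):
    # X: (seq_len, d_model). Each token is transformed independently by a
    # two-layer feed-forward network into a row of seq_len scores, so no
    # query-key dot product between tokens is ever computed.
    H = np.maximum(X @ W1 + b1, 0.0)   # ReLU hidden layer, (seq_len, d_model)
    B = H @ W2 + b2                    # synthesized scores, (seq_len, seq_len)
    A = softmax(B, axis=-1)            # each row is a distribution over positions
    return A @ (X @ Wv), A

def random_synthesizer(X, R, Wv):
    # R: (seq_len, seq_len) parameter matrix. The alignment does not depend
    # on X at all; only the values X @ Wv do.
    A = softmax(R, axis=-1)
    return A @ (X @ Wv), A

# Toy example with randomly initialized (untrained) weights.
rng = np.random.default_rng(0)
seq_len, d_model = 4, 8
X = rng.normal(size=(seq_len, d_model))
W1 = rng.normal(size=(d_model, d_model))
b1 = np.zeros(d_model)
W2 = rng.normal(size=(d_model, seq_len))
b2 = np.zeros(seq_len)
Wv = rng.normal(size=(d_model, d_model))
R = rng.normal(size=(seq_len, seq_len))

Y_dense, A_dense = dense_synthesizer(X, W1, b1, W2, b2, Wv)
Y_rand, A_rand = random_synthesizer(X, R, Wv)
print(Y_dense.shape, A_dense.shape, Y_rand.shape, A_rand.shape)
```

In both variants the output is still a weighted sum of value vectors, as in standard attention; only the way the weights are produced changes. Because the synthesized score matrix has a fixed width of seq_len, this sketch assumes a fixed sequence length.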

Source: Synthesizer: Rethinking Self-Attention in Transformer Models

Tasks

Task                          Papers   Share
Language Modeling             3        4.17%
Language Modelling            3        4.17%
Pose Estimation               3        4.17%
Object                        2        2.78%
Synthetic Data Generation     2        2.78%
Zero-Shot Object Detection    2        2.78%
Voice Cloning                 2        2.78%
Object Detection              2        2.78%
regression                    2        2.78%
