Extended Transformer Construction

Introduced by Ainslie et al. in ETC: Encoding Long and Structured Inputs in Transformers

Extended Transformer Construction, or ETC, is an extension of the Transformer architecture with a new attention mechanism that extends the original in two main ways: (1) it allows scaling up the input length from 512 to several thousand tokens; and (2) it can ingest structured inputs rather than just linear sequences. The key ideas that enable this are a new global-local attention mechanism coupled with relative position encodings. ETC also allows lifting weights from existing BERT models, saving computational resources during training.

Source: ETC: Encoding Long and Structured Inputs in Transformers
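To make the global-local idea concrete, below is a minimal sketch (not the authors' implementation) of how ETC-style attention masks could be built: a small set of global tokens attends to and is attended by everything, while the long input attends locally within a sliding window. The function names, the sizes `n_global`, `n_long`, and the radius `r` are illustrative assumptions.

```python
# Minimal sketch of ETC-style global-local attention masks (assumed, not official code).
import numpy as np

def global_local_masks(n_global: int, n_long: int, r: int):
    """Boolean masks for the four attention pieces described in the paper:
    global-to-global, global-to-long, long-to-global (unrestricted)
    and long-to-long (restricted to a sliding window of radius r)."""
    g2g = np.ones((n_global, n_global), dtype=bool)   # global attends to all global tokens
    g2l = np.ones((n_global, n_long), dtype=bool)     # global attends to all long tokens
    l2g = np.ones((n_long, n_global), dtype=bool)     # long attends to all global tokens
    idx = np.arange(n_long)
    l2l = np.abs(idx[:, None] - idx[None, :]) <= r    # long attends only within radius r
    return g2g, g2l, l2g, l2l

def masked_attention(q, k, v, mask):
    """Scaled dot-product attention with a boolean mask (True = allowed)."""
    scores = q @ k.T / np.sqrt(q.shape[-1])
    scores = np.where(mask, scores, -1e9)             # block disallowed pairs
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

# Example: 4 global tokens, 64 long-input tokens, local radius 8
g2g, g2l, l2g, l2l = global_local_masks(4, 64, 8)
d = 16
q_long = np.random.randn(64, d)
kv_long = np.random.randn(64, d)
out_long = masked_attention(q_long, kv_long, kv_long, l2l)
print(out_long.shape)  # (64, 16)
```

Because the long-to-long piece only attends within a fixed radius, its cost grows linearly with input length rather than quadratically, which is what lets ETC scale to inputs of several thousand tokens.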
