Extended Transformer Construction (ETC) extends the Transformer architecture with a new attention mechanism that generalizes the original in two main ways: (1) it scales the input length from 512 tokens to several thousand; and (2) it can ingest structured inputs rather than just linear sequences. The key ideas behind these extensions are a new global-local attention mechanism coupled with relative position encodings. ETC also allows lifting weights from existing BERT models, saving computational resources during training.
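A minimal sketch of the global-local attention idea is shown below: a small set of global tokens attends to everything, while long-input tokens attend only to the global tokens and to a local window of neighbors. This is a simplified single-head illustration without relative position encodings; names such as `local_radius` are illustrative, and the dense mask stands in for the banded/sparse computation a real implementation would use for efficiency.

```python
# Sketch of ETC-style global-local attention (single head, dense mask
# for clarity; not the paper's efficient banded implementation).
import torch
import torch.nn.functional as F

def global_local_attention(long_x, global_x, local_radius=2):
    """long_x: (n_long, d) long-input tokens; global_x: (n_global, d) global tokens.

    Global tokens attend to all tokens; long tokens attend only to the
    global tokens and to long tokens within `local_radius` positions.
    """
    n_long, d = long_x.shape
    n_global = global_x.shape[0]
    x = torch.cat([global_x, long_x], dim=0)      # (n_global + n_long, d)
    scores = (x @ x.T) / d ** 0.5                 # full score matrix

    # Build the sparse attention pattern: start fully blocked.
    n = n_global + n_long
    mask = torch.zeros(n, n, dtype=torch.bool)
    mask[:n_global, :] = True                     # global tokens attend to all
    mask[:, :n_global] = True                     # all tokens attend to global
    for i in range(n_long):                       # long-to-long: local window only
        lo = max(0, i - local_radius)
        hi = min(n_long, i + local_radius + 1)
        mask[n_global + i, n_global + lo : n_global + hi] = True

    scores = scores.masked_fill(~mask, float("-inf"))
    attn = F.softmax(scores, dim=-1)
    return attn @ x                               # (n_global + n_long, d)

out = global_local_attention(torch.randn(16, 8), torch.randn(4, 8))
print(out.shape)  # torch.Size([20, 8])
```

Because the local window has a fixed radius, the long-to-long attention cost grows linearly in the input length rather than quadratically, which is what makes the jump from 512 tokens to several thousand practical.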
Source: ETC: Encoding Long and Structured Inputs in Transformers
| Task | Papers | Share |
|---|---|---|
| Language Modelling | 2 | 6.90% |
| Question Answering | 2 | 6.90% |
| Reinforcement Learning (RL) | 2 | 6.90% |
| Multi-Armed Bandits | 2 | 6.90% |
| Thompson Sampling | 2 | 6.90% |
| Management | 1 | 3.45% |
| Pseudo Label | 1 | 3.45% |
| Traffic Classification | 1 | 3.45% |
| Disease Prediction | 1 | 3.45% |