Reformer

Introduced by Kitaev et al. in Reformer: The Efficient Transformer

Reformer is a Transformer-based architecture that seeks to improve efficiency. Standard dot-product attention is replaced with an attention mechanism based on locality-sensitive hashing, reducing its complexity from $O(L^2)$ to $O(L \log L)$, where $L$ is the length of the sequence. Furthermore, Reformer uses reversible residual layers instead of standard residuals, which allows activations to be stored only once during training instead of $N$ times, where $N$ is the number of layers.
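To make the hashing trick concrete, below is a minimal NumPy sketch of angular LSH bucketing in the spirit of the paper: vectors are multiplied by a random rotation, and the bucket is the argmax over the rotated vector and its negation, so vectors with high cosine similarity tend to land in the same bucket. The function name `lsh_buckets` and the toy shapes are illustrative assumptions, not Reformer's actual code.

```python
# A minimal sketch of angular LSH bucketing; names and shapes are assumptions.
import numpy as np

def lsh_buckets(x, n_buckets, rng):
    """Assign each row of x (shape (L, d)) to one of n_buckets buckets.

    Uses a random rotation R of shape (d, n_buckets // 2); the bucket is
    the argmax over the concatenation [xR; -xR], so vectors with high
    cosine similarity tend to share a bucket.
    """
    assert n_buckets % 2 == 0
    r = rng.standard_normal((x.shape[-1], n_buckets // 2))
    rotated = x @ r                                   # (L, n_buckets // 2)
    return np.argmax(np.concatenate([rotated, -rotated], axis=-1), axis=-1)

# Toy usage: 8 query vectors of dimension 16 hashed into 4 buckets.
rng = np.random.default_rng(0)
q = rng.standard_normal((8, 16))
print(lsh_buckets(q, n_buckets=4, rng=rng))           # bucket id per query
```

Attention is then computed only among positions that share a bucket (after sorting by bucket id), which is what brings the cost down from $O(L^2)$ toward $O(L \log L)$.

The reversible residual idea can likewise be sketched in a few lines: each layer maps a pair $(x_1, x_2)$ to $(y_1, y_2)$ in a way that can be inverted exactly, so intermediate activations can be recomputed during the backward pass instead of being stored per layer. Again, the helpers below are a hedged sketch with arbitrary sublayer functions `f` and `g`, not Reformer's implementation.

```python
# A minimal sketch of a reversible residual block; names are illustrative.
def reversible_forward(x1, x2, f, g):
    """Forward pass: y1 = x1 + f(x2), then y2 = x2 + g(y1)."""
    y1 = x1 + f(x2)
    y2 = x2 + g(y1)
    return y1, y2

def reversible_inverse(y1, y2, f, g):
    """Recover the inputs exactly from the outputs, so activations
    need not be stored for every layer during backpropagation."""
    x2 = y2 - g(y1)
    x1 = y1 - f(x2)
    return x1, x2

# Round-trip check with toy sublayer functions.
f = lambda t: 2.0 * t
g = lambda t: t + 1.0
y1, y2 = reversible_forward(3.0, 5.0, f, g)
assert reversible_inverse(y1, y2, f, g) == (3.0, 5.0)
```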

Tasks


Task                          Papers   Share
Language Modelling            2        5.00%
Time Series Analysis          2        5.00%
Time Series Forecasting       2        5.00%
Sentence                      2        5.00%
Survey                        2        5.00%
Deep Learning                 2        5.00%
Reinforcement Learning (RL)   2        5.00%
Deblurring                    1        2.50%
Image Restoration             1        2.50%
