2 Nov 2022 • Dimitris Mamakas, Petros Tsotsi, Ion Androutsopoulos, Ilias Chalkidis
Even sparse-attention models, such as Longformer and BigBird, which increase the maximum input length to 4,096 sub-words, severely truncate texts in three of the six datasets of LexGLUE.