Long-range arena (LRA) is an effort toward systematic evaluation of efficient transformer models. The project aims at establishing benchmark tasks/datasets using which we can evaluate transformer-based models in a systematic way, by assessing their generalization power, computational efficiency, memory foot-print, etc. Long-Range Arena is specifically focused on evaluating model quality under long-context scenarios. The benchmark is a suite of tasks consisting of sequences ranging from 1K to 16K tokens, encompassing a wide range of data types and modalities such as text, natural, synthetic images, and mathematical expressions requiring similarity, structural, and visual-spatial reasoning.
140 PAPERS • 1 BENCHMARK
SCROLLS (Standardized CompaRison Over Long Language Sequences) is an NLP benchmark consisting of a suite of tasks that require reasoning over long texts. SCROLLS contains summarization, question answering, and natural language inference tasks, covering multiple domains, including literature, science, business, and entertainment. The dataset is made available in a unified text-to-text format and host a live leaderboard to facilitate research on model architecture and pretraining methods.
31 PAPERS • 1 BENCHMARK
MuLD (Multitask Long Document Benchmark) is a set of 6 NLP tasks where the inputs consist of at least 10,000 words. The benchmark covers a wide variety of task types including translation, summarization, question answering, and classification. Additionally there is a range of output lengths from a single word classification label all the way up to an output longer than the input text.
3 PAPERS • 6 BENCHMARKS
Pathfinder and Pathfinder-X have proven to be instrumental in training and testing Large Language Models with long-range dependencies. Recently, Meta's Moving Average Equipped Gated Attention model scored a 97% on the Pathfinder-X dataset, indicating a need for a larger, more challenging dataset. Whereas Pathfinder-X only went up to 256 x 256 pixel images (or a sequence length of 65,536 tokens), Pathfinder-X2 introduces images of 512 x 512 pixels, or 262,144 tokens.
0 PAPER • NO BENCHMARKS YET