SCROLLS (Standardized CompaRison Over Long Language Sequences) is an NLP benchmark consisting of a suite of tasks that require reasoning over long texts. SCROLLS contains summarization, question answering, and natural language inference tasks covering multiple domains, including literature, science, business, and entertainment. The data is made available in a unified text-to-text format, and a live leaderboard is hosted to facilitate research on model architecture and pretraining methods.
The SCROLLS benchmark contains the datasets GovReport, SummScreenFD, QMSum, QASPER, NarrativeQA, QuALITY and ContractNLI.
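As a minimal sketch of how the unified text-to-text format can be consumed, the snippet below loads one SCROLLS subset with the Hugging Face `datasets` library. The dataset identifier `tau/scrolls`, the config name `gov_report`, and the `input`/`output` field names are assumptions about how the data is hosted, not details stated above.

```python
# Sketch: loading a SCROLLS subset in its text-to-text form.
# Assumes the benchmark is hosted on the Hugging Face Hub as "tau/scrolls"
# with per-dataset configs (e.g. "gov_report") and "input"/"output" fields.
from datasets import load_dataset

gov_report = load_dataset("tau/scrolls", "gov_report", split="validation")

example = gov_report[0]
print(example["input"][:500])  # long source document (truncated for display)
print(example["output"])       # target text, e.g. the reference summary
```

Each of the other subsets (SummScreenFD, QMSum, QASPER, NarrativeQA, QuALITY, ContractNLI) would be selected the same way by swapping the config name, under the same assumption about how the configs are named.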