Reading Comprehension
568 papers with code • 7 benchmarks • 95 datasets
Most current question answering datasets frame the task as reading comprehension, where the question is about a paragraph or document and the answer is often a span in the document.
Specific reading comprehension tasks include multi-modal machine reading comprehension and textual machine reading comprehension, among others. In the literature, machine reading comprehension is divided into four categories: cloze style, multiple choice, span prediction, and free-form answer.
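The span-prediction formulation above can be sketched concretely: a model assigns each token a start score and an end score, and the predicted answer is the highest-scoring valid span. A minimal illustrative sketch (the scores below are made up for the example, not produced by a real model):

```python
def best_span(start_scores, end_scores, max_len=15):
    """Return the (start, end) token indices maximizing
    start_scores[i] + end_scores[j], subject to i <= j < i + max_len."""
    best, best_score = (0, 0), float("-inf")
    for i, s in enumerate(start_scores):
        for j in range(i, min(i + max_len, len(end_scores))):
            if s + end_scores[j] > best_score:
                best_score = s + end_scores[j]
                best = (i, j)
    return best

tokens = ["The", "capital", "of", "France", "is", "Paris", "."]
# Hypothetical scores a QA model might emit for
# "What is the capital of France?"
start = [0.1, 0.0, 0.0, 0.2, 0.0, 2.5, 0.0]
end   = [0.0, 0.1, 0.0, 0.3, 0.0, 2.4, 0.1]
i, j = best_span(start, end)
print(" ".join(tokens[i:j + 1]))  # -> Paris
```

In practice the start/end scores come from a model fine-tuned on a span-prediction dataset such as SQuAD, and the search is usually restricted to spans inside the context paragraph.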
Benchmark datasets used for testing a model's reading comprehension abilities include MovieQA, ReCoRD, and RACE, among others.
The Machine Reading group at UCL also provides an overview of reading comprehension tasks.
Figure source: A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets
Libraries
Use these libraries to find Reading Comprehension models and implementations.
Subtasks
- Machine Reading Comprehension
- Intent Recognition
- Implicit Relations
- LAMBADA
- Question Selection
- Multi-Hop Reading Comprehension
- Implicatures
- Logical Reasoning Reading Comprehension
- English Proverbs
- Fantasy Reasoning
- Figure Of Speech Detection
- Formal Fallacies Syllogisms Negation
- GRE Reading Comprehension
- Hyperbaton
- Movie Dialog Same Or Different
- Nonsense Words Grammar
- Phrase Relatedness
- RACE-h
- RACE-m
Latest papers
AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models
Given the importance of ancient Chinese in capturing the essence of rich historical and cultural heritage, the rapid advancements in Large Language Models (LLMs) necessitate benchmarks that can effectively evaluate their understanding of ancient contexts.
Video Relationship Detection Using Mixture of Experts
Classifiers trained by a single, monolithic neural network often lack stability and generalization.
Scaling Rectified Flow Transformers for High-Resolution Image Synthesis
Rectified flow is a recent generative model formulation that connects data and noise in a straight line.
PoTeC: A German Naturalistic Eye-tracking-while-reading Corpus
The Potsdam Textbook Corpus (PoTeC) is a naturalistic eye-tracking-while-reading corpus containing data from 75 participants reading 12 scientific texts.
Causal Orthogonalization: Multicollinearity, Economic Interpretability, and the Gram-Schmidt Process
This paper considers the problem of interpreting orthogonalization model coefficients.
VlogQA: Task, Dataset, and Baseline Models for Vietnamese Spoken-Based Machine Reading Comprehension
This paper presents the development process of a Vietnamese spoken language corpus for machine reading comprehension (MRC) tasks and provides insights into the challenges and opportunities associated with using real-world data for machine reading comprehension tasks.
An Information-Theoretic Approach to Analyze NLP Classification Tasks
This work provides an information-theoretic framework to analyse the influence of inputs for text classification tasks.
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment
Nevertheless, we posit that LLMs inherently harbor role-play capabilities, owing to the extensive knowledge of characters and potential dialogues ingrained in their vast training corpora.
Knowledge Fusion of Large Language Models
In this paper, we introduce the notion of knowledge fusion for LLMs, aimed at combining the capabilities of existing LLMs and transferring them into a single LLM.
Improving Domain Adaptation through Extended-Text Reading Comprehension
To enhance the domain-specific capabilities of large language models, continued pre-training on a domain-specific corpus is a prevalent method.