Reading Comprehension

409 papers with code • 6 benchmarks • 89 datasets

Most current question answering datasets frame the task as reading comprehension, where the question is about a paragraph or document and the answer is often a span in that document.

Specific variants of reading comprehension include multi-modal machine reading comprehension and textual machine reading comprehension, among others. In the literature, machine reading comprehension is commonly divided into four categories: cloze style, multiple choice, span prediction, and free-form answer.
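The four categories differ mainly in how the answer is expressed. A minimal sketch below illustrates each format with invented data (the context, questions, and answers are hypothetical, not drawn from any dataset):

```python
# Illustrative examples of the four machine reading comprehension categories.
context = "SQuAD was released by Stanford in 2016."

cloze = {
    "context": context,
    "question": "SQuAD was released by ____ in 2016.",  # fill in the blank
    "answer": "Stanford",
}

multiple_choice = {
    "context": context,
    "question": "Who released SQuAD?",
    "options": ["Google", "Stanford", "DeepMind"],
    "answer": 1,  # index of the correct option
}

span_prediction = {
    "context": context,
    "question": "Who released SQuAD?",
    "answer_start": 22,  # character offset where the answer span begins
    "answer_end": 30,    # exclusive end offset
}

free_form = {
    "context": context,
    "question": "Who released SQuAD, and when?",
    "answer": "Stanford released it in 2016.",  # free text, not a span
}

# For span prediction, the answer is recovered by slicing the context:
span = span_prediction["context"][
    span_prediction["answer_start"]:span_prediction["answer_end"]
]
```

Span prediction dominates current benchmarks because span offsets make evaluation unambiguous, while free-form answers require softer metrics such as ROUGE or human judgment.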

Benchmark datasets used for testing a model's reading comprehension abilities include MovieQA, ReCoRD, and RACE, among others.

The Machine Reading group at UCL also provides an overview of reading comprehension tasks.

Figure source: A Survey on Machine Reading Comprehension: Tasks, Evaluation Metrics and Benchmark Datasets



Most implemented papers

RoBERTa: A Robustly Optimized BERT Pretraining Approach

pytorch/fairseq 26 Jul 2019

Language model pretraining has led to significant performance gains but careful comparison between different approaches is challenging.

XLNet: Generalized Autoregressive Pretraining for Language Understanding

zihangdai/xlnet NeurIPS 2019

With the capability of modeling bidirectional contexts, denoising autoencoding based pretraining like BERT achieves better performance than pretraining approaches based on autoregressive language modeling.

Bidirectional Attention Flow for Machine Comprehension

allenai/bi-att-flow 5 Nov 2016

Machine comprehension (MC), answering a query about a given context paragraph, requires modeling complex interactions between the context and the query.

Towards AI-Complete Question Answering: A Set of Prerequisite Toy Tasks

facebook/bAbI-tasks 19 Feb 2015

One long-term goal of machine learning research is to produce methods that are applicable to reasoning and natural language, in particular building an intelligent dialogue agent.

SQuAD: 100,000+ Questions for Machine Comprehension of Text

worksheets/0xd53d03a4 EMNLP 2016

We present the Stanford Question Answering Dataset (SQuAD), a new reading comprehension dataset consisting of 100,000+ questions posed by crowdworkers on a set of Wikipedia articles, where the answer to each question is a segment of text from the corresponding reading passage.
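Because every SQuAD answer is a segment of the passage, each annotation can be stored as answer text plus a character offset into the context. A minimal sketch, assuming the standard `answer_start` offset convention (the passage and question here are invented):

```python
# A minimal SQuAD-style record: the answer is a span of the context,
# given as its text plus a character offset.
example = {
    "context": "The Stanford Question Answering Dataset was released in 2016.",
    "question": "When was SQuAD released?",
    "answers": [{"text": "2016", "answer_start": 56}],
}

ans = example["answers"][0]
start = ans["answer_start"]
# The gold answer is recoverable by slicing the context at the offset:
span = example["context"][start:start + len(ans["text"])]
```

This span-based encoding is what lets extractive models be trained to predict just two token positions (start and end) rather than generate free text.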

QANet: Combining Local Convolution with Global Self-Attention for Reading Comprehension

BangLiu/QANet-PyTorch ICLR 2018

On the SQuAD dataset, our model is 3x to 13x faster in training and 4x to 9x faster in inference, while achieving equivalent accuracy to recurrent models.

Teaching Machines to Read and Comprehend

deepmind/rc-data NeurIPS 2015

Teaching machines to read natural language documents remains an elusive challenge.

MS MARCO: A Human Generated MAchine Reading COmprehension Dataset

dfcf93/MSMARCO 28 Nov 2016

The size of the dataset and the fact that the questions are derived from real user search queries distinguishes MS MARCO from other well-known publicly available datasets for machine reading comprehension and question-answering.

Know What You Don't Know: Unanswerable Questions for SQuAD

worksheets/0x9a15a170 ACL 2018

Extractive reading comprehension systems can often locate the correct answer to a question in a context document, but they also tend to make unreliable guesses on questions for which the correct answer is not stated in the context.
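SQuAD 2.0 operationalizes this by adding questions with no answer in the passage, marked with an `is_impossible` flag and an empty answer list; a system must abstain on those rather than guess. A toy sketch of that schema and the expected abstention behavior (the questions and answers are invented):

```python
# SQuAD 2.0-style entries: unanswerable questions carry no answer span.
answerable = {
    "question": "Who wrote the passage?",
    "is_impossible": False,
    "answers": [{"text": "Jane Doe", "answer_start": 0}],
}

unanswerable = {
    "question": "What year was the author born?",
    "is_impossible": True,
    "answers": [],  # the correct behavior is to abstain, not guess
}

def predict(entry):
    """Toy oracle: return the gold span, or None to abstain."""
    if entry["is_impossible"]:
        return None
    return entry["answers"][0]["text"]
```

Real systems implement the abstention decision by comparing the best span's score against a learned "no answer" threshold instead of reading a gold flag, but the output contract is the same: a span or an explicit abstention.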

Language Models are Unsupervised Multitask Learners

PaddlePaddle/PaddleNLP Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.