TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Question Generation	SQuAD1.1	ProphetNet + ASGen	BLEU-4	24.44	# 3
Question Generation	SQuAD1.1	ProphetNet + ASGen	METEOR	26.73	# 1
Question Generation	SQuAD1.1	ProphetNet + ASGen	ROUGE-L	52.8	# 1
Question Generation	SQuAD1.1	UniLM + ASGen	BLEU-4	23.7	# 7
Question Generation	SQuAD1.1	UniLM + ASGen	METEOR	25.9	# 5
Question Generation	SQuAD1.1	UniLM + ASGen	ROUGE-L	52.3	# 4

Learning to Generate Questions by Recovering Answer-containing Sentences

1 Jan 2021 · Seohyun Back, Akhil Kedia, Sai Chetan Chinthakindi, Haejun Lee, Jaegul Choo

To train a question answering model based on machine reading comprehension (MRC), significant effort is required to prepare annotated training data composed of questions and their answers from contexts. To mitigate this issue, recent research has focused on synthetically generating a question from a given context and an annotated (or generated) answer by training an additional generative model, which can then be used to augment the training data. In light of this research direction, we propose a novel pre-training approach that learns to generate contextually rich questions by recovering answer-containing sentences. Our approach is composed of two novel components: (1) dynamically determining K answers from a given document and (2) pre-training the question generator on the task of generating the answer-containing sentence. We evaluate our method against existing ones in terms of the quality of generated questions as well as the accuracy of an MRC model fine-tuned on the data synthetically generated by our method. Experimental results demonstrate that our approach consistently improves the question generation capability of existing models such as UniLM, and shows state-of-the-art results on MS MARCO and NewsQA and results comparable to the state of the art on SQuAD. Additionally, we demonstrate that the data synthetically generated by our approach boosts downstream MRC accuracy across a wide range of datasets, such as SQuAD-v1.1, SQuAD-v2.0, and KorQuAD, without any modification to the existing MRC models. Furthermore, our experiments highlight that our method is especially effective when only a limited amount of training data is available, in terms of both pre-training and downstream MRC data.
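Component (2) of the abstract, pre-training on generating the answer-containing sentence, can be illustrated with a minimal sketch of how one pre-training example might be built. This is not the authors' implementation: the names `PretrainExample`, `split_sentences`, and `build_example` are hypothetical, the regex sentence splitter is a naive stand-in, and removing the target sentence from the context is an assumption the abstract does not state. Component (1), choosing K answers per document, is not covered here.

```python
import re
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PretrainExample:
    context: str  # document with the answer-containing sentence removed (assumption)
    answer: str   # answer span given to the generator as conditioning input
    target: str   # answer-containing sentence the generator learns to recover


def split_sentences(text: str) -> List[str]:
    # Naive split after ., ! or ? followed by whitespace; a real system
    # would likely use a proper sentence tokenizer.
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def build_example(document: str, answer: str) -> Optional[PretrainExample]:
    # Find the first sentence containing the answer and turn it into the target.
    sentences = split_sentences(document)
    for i, sentence in enumerate(sentences):
        if answer in sentence:
            rest = sentences[:i] + sentences[i + 1:]
            return PretrainExample(" ".join(rest), answer, sentence)
    return None  # answer does not occur in the document


doc = (
    "Seoul is the capital of South Korea. "
    "It hosted the Olympics in 1988. "
    "The city lies on the Han River."
)
example = build_example(doc, "1988")
print(example.target)
```

In this sketch the generator would receive `context` and `answer` as input and be trained to produce `target`; fine-tuning on real question generation data would then replace the sentence target with a question.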

Code

No code implementations yet.

Tasks

Machine Reading Comprehension

Question Answering

Question Generation

Reading Comprehension

Sentence

Datasets

SQuAD

MS MARCO

NewsQA

KorQuAD

Results from the Paper

Ranked #3 on Question Generation on SQuAD1.1 (using extra training data)

Methods

No methods listed for this paper.
