TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Abstractive Text Summarization	CNN / Daily Mail	ProphetNet	ROUGE-1	44.20	# 17
Abstractive Text Summarization	CNN / Daily Mail	ProphetNet	ROUGE-2	21.17	# 18
Abstractive Text Summarization	CNN / Daily Mail	ProphetNet	ROUGE-L	41.30	# 13
Text Summarization	GigaWord	ProphetNet	ROUGE-1	39.51	# 7
Text Summarization	GigaWord	ProphetNet	ROUGE-2	20.42	# 7
Text Summarization	GigaWord	ProphetNet	ROUGE-L	36.69	# 8
Question Generation	SQuAD1.1	ProphetNet	BLEU-4	23.91	# 6
Question Generation	SQuAD1.1	ProphetNet	METEOR	26.6	# 3
Question Generation	SQuAD1.1	ProphetNet	ROUGE-L	52.3	# 4

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/prophetnet-predicting-future-n-gram-for/question-generation-on-squad11)](https://paperswithcode.com/sota/question-generation-on-squad11?p=prophetnet-predicting-future-n-gram-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/prophetnet-predicting-future-n-gram-for/text-summarization-on-gigaword)](https://paperswithcode.com/sota/text-summarization-on-gigaword?p=prophetnet-predicting-future-n-gram-for)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/prophetnet-predicting-future-n-gram-for/abstractive-text-summarization-on-cnn-daily)](https://paperswithcode.com/sota/abstractive-text-summarization-on-cnn-daily?p=prophetnet-predicting-future-n-gram-for)`

ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training

13 Jan 2020 · Weizhen Qi, Yu Yan, Yeyun Gong, Dayiheng Liu, Nan Duan, Jiusheng Chen, Ruofei Zhang, Ming Zhou

This paper presents a new sequence-to-sequence pre-training model called ProphetNet, which introduces a novel self-supervised objective, future n-gram prediction, together with an n-stream self-attention mechanism. Instead of optimizing one-step-ahead prediction as in traditional sequence-to-sequence models, ProphetNet is optimized by n-step-ahead prediction: at each time step it predicts the next n tokens simultaneously, based on the previous context tokens. Future n-gram prediction explicitly encourages the model to plan for future tokens and prevents overfitting on strong local correlations. We pre-train ProphetNet on a base-scale dataset (16GB) and a large-scale dataset (160GB), respectively. We then conduct experiments on the CNN/DailyMail, Gigaword, and SQuAD 1.1 benchmarks for abstractive summarization and question generation. Experimental results show that ProphetNet achieves new state-of-the-art results on all these datasets compared to models using the same scale of pre-training corpus.
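The n-step-ahead objective described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names `future_ngram_targets` and `future_ngram_loss` are hypothetical, the per-stream `weights` are an assumption (the abstract does not say how the n predictions are combined), and the n-stream self-attention mechanism that would produce the probabilities is not modeled at all. The sketch only shows how each time step is paired with the next n target tokens and how a weighted negative log-likelihood over those targets could be computed.

```python
import math
from typing import Dict, List, Optional, Sequence


def future_ngram_targets(tokens: Sequence[int], n: int) -> List[List[Optional[int]]]:
    """For each time step t, list the next n target tokens.

    targets[t][j] is tokens[t + j], or None when t + j runs past the end.
    j = 0 is the ordinary one-step-ahead target.
    """
    if n < 1:
        raise ValueError("n must be at least 1")
    return [
        [tokens[t + j] if t + j < len(tokens) else None for j in range(n)]
        for t in range(len(tokens))
    ]


def future_ngram_loss(
    log_probs: Sequence[Sequence[Dict[int, float]]],
    tokens: Sequence[int],
    weights: Sequence[float],
) -> float:
    """Weighted negative log-likelihood summed over n prediction streams.

    log_probs[j][t] maps a token id to the log-probability that stream j
    assigns, at time step t, to the token at position t + j.
    weights[j] scales stream j (an assumed combination rule).
    """
    targets = future_ngram_targets(tokens, len(weights))
    total = 0.0
    for t, step_targets in enumerate(targets):
        for j, target in enumerate(step_targets):
            if target is None:  # no token that far ahead; skip
                continue
            total -= weights[j] * log_probs[j][t][target]
    return total


if __name__ == "__main__":
    tokens = [5, 6, 7, 8]
    print(future_ngram_targets(tokens, n=2))
    # Placeholder model: both streams assign uniform probability 0.25.
    uniform = [
        [{tok: math.log(0.25) for tok in tokens} for _ in tokens]
        for _ in range(2)
    ]
    print(future_ngram_loss(uniform, tokens, weights=[1.0, 0.5]))
```

With `n = 1` the targets reduce to the usual one-step-ahead setup, which makes the sketch easy to compare against a standard sequence-to-sequence loss.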

Code

microsoft/ProphetNet (official, 616 stars)

huggingface/transformers (124,793 stars)

microsoft/ar2

d294270681/ProphetNet-paddle

Tasks

Abstractive Text Summarization

Question Generation

Question-Generation

Text Summarization

Datasets

SQuAD

CNN/Daily Mail

BookCorpus

Results from the Paper

Ranked #6 on Question Generation on SQuAD1.1 (using extra training data)

Methods

ProphetNet
