ERNIE-GEN: An Enhanced Multi-Flow Pre-training and Fine-tuning Framework for Natural Language Generation

26 Jan 2020 · Dongling Xiao, Han Zhang, Yukun Li, Yu Sun, Hao Tian, Hua Wu, Haifeng Wang

Current pre-training work in natural language generation pays little attention to the problem of exposure bias on downstream tasks. To address this issue, we propose an enhanced multi-flow sequence-to-sequence pre-training and fine-tuning framework named ERNIE-GEN, which bridges the discrepancy between training and inference with an infilling generation mechanism and a noise-aware generation method...
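
The abstract names a noise-aware generation method but the excerpt does not describe it. One plausible reading, assumed here rather than taken from the text above, is that training feeds the decoder a target sequence in which some ground-truth tokens have been swapped for random vocabulary items, so the model learns to cope with its own mistakes at inference time. The sketch below illustrates only that idea; the function name `corrupt_target`, the `noise_rate` value, and the toy vocabulary are all hypothetical.

```python
import random


def corrupt_target(tokens, vocab, noise_rate=0.05, rng=None):
    """Return a copy of `tokens` in which each position is replaced,
    with probability `noise_rate`, by a random item from `vocab`.

    Illustrative assumption only: this is not the authors' implementation.
    """
    if rng is None:
        rng = random.Random()
    noisy = []
    for tok in tokens:
        if rng.random() < noise_rate:
            noisy.append(rng.choice(vocab))  # inject a "mistake"
        else:
            noisy.append(tok)                # keep the gold token
    return noisy


# Toy data (hypothetical).
target = ["the", "cat", "sat", "on", "the", "mat"]
vocab = ["the", "cat", "dog", "sat", "on", "mat", "ran"]

# A fixed seed keeps the example reproducible.
noisy_input = corrupt_target(target, vocab, noise_rate=0.3, rng=random.Random(0))
```

In such a setup the decoder would read `noisy_input` while the loss is still computed against the clean `target`, which exposes the model during training to the kind of imperfect history it sees at inference.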


Results from the Paper


 Ranked #1 on Text Summarization on GigaWord-10k (using extra training data)

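TASK                            DATASET           MODEL                                       METRIC   VALUE  GLOBAL RANK
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN BASE                              ROUGE-1  42.30  #7
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN BASE                              ROUGE-2  19.92  #6
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN BASE                              ROUGE-L  39.68  #7
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-1  44.31  #1
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-2  21.35  #2
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-L  41.60  #1
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE                             ROUGE-1  44.02  #4
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE                             ROUGE-2  21.17  #3
Abstractive Text Summarization  CNN / Daily Mail  ERNIE-GEN LARGE                             ROUGE-L  41.26  #3
Generative Question Answering   CoQA              ERNIE-GEN                                   F1-Score 84.5   #1
Text Summarization              GigaWord          ERNIE-GEN BASE                              ROUGE-1  38.83  #6
Text Summarization              GigaWord          ERNIE-GEN BASE                              ROUGE-2  20.04  #5
Text Summarization              GigaWord          ERNIE-GEN BASE                              ROUGE-L  36.20  #5
Text Summarization              GigaWord          ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-1  39.46  #2
Text Summarization              GigaWord          ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-2  20.34  #2
Text Summarization              GigaWord          ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-L  36.74  #1
Text Summarization              GigaWord          ERNIE-GEN LARGE                             ROUGE-1  39.25  #3
Text Summarization              GigaWord          ERNIE-GEN LARGE                             ROUGE-2  20.25  #3
Text Summarization              GigaWord          ERNIE-GEN LARGE                             ROUGE-L  36.53  #3
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE                             ROUGE-L  32.50  #2
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE                             ROUGE-1  35.05  #2
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE                             ROUGE-2  16.10  #2
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-L  33.23  #1
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-1  35.51  #1
Text Summarization              GigaWord-10k      ERNIE-GEN LARGE (large-scale text corpora)  ROUGE-2  16.79  #1
Text Summarization              GigaWord-10k      ERNIE-GEN BASE                              ROUGE-L  31.35  #3
Text Summarization              GigaWord-10k      ERNIE-GEN BASE                              ROUGE-1  33.75  #3
Text Summarization              GigaWord-10k      ERNIE-GEN BASE                              ROUGE-2  15.23  #3
Question Generation             SQuAD1.1          ERNIE-GEN LARGE (large-scale text corpora)  BLEU-4   25.41  #1
Question Generation             SQuAD1.1          ERNIE-GEN LARGE (beam size=5)               BLEU-4   25.4   #2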
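
The ROUGE-1 and ROUGE-2 values reported above measure unigram and bigram overlap between a generated summary and a reference. As a rough illustration of what those numbers mean, here is a minimal sketch of a ROUGE-N F1 score for a single candidate/reference pair. It assumes plain whitespace tokenization and no stemming, so it will not reproduce the official scores; benchmark figures are typically averaged over a test set and scaled by 100. The function names are illustrative.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_f1(candidate, reference, n=1):
    """F1 of clipped n-gram overlap between two whitespace-tokenized strings."""
    cand = ngrams(candidate.split(), n)
    ref = ngrams(reference.split(), n)
    # Counter intersection keeps the minimum count per n-gram (clipping).
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


candidate = "the cat sat on the mat"
reference = "the cat lay on the mat"
rouge1 = rouge_n_f1(candidate, reference, n=1)  # 5 of 6 unigrams match
rouge2 = rouge_n_f1(candidate, reference, n=2)  # 3 of 5 bigrams match
```

ROUGE-L, also listed above, is based on the longest common subsequence rather than fixed-length n-grams and is not covered by this sketch.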
