GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation
Data collection for knowledge graph-to-text generation is expensive. As a result, research on unsupervised models has recently emerged as an active field. However, most unsupervised models have to use non-parallel versions of existing small supervised datasets, which largely constrains their potential. In this paper, we propose a large-scale, general-domain dataset, GenWiki. Our unsupervised dataset has 1.3M text examples and 1.3M graph examples. With a human-annotated test set, we provide this new benchmark dataset for future research on unsupervised text generation from knowledge graphs.
COLING 2020
Results from the Paper
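Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Base | BLEU | 41.59 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Base | METEOR | 35.72 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Base | ROUGE-L | 63.31 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Base | CIDEr | 3.57 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | Rule-Based | BLEU | 13.45 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | Rule-Based | METEOR | 30.72 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | Rule-Based | ROUGE-L | 40.93 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | Rule-Based | CIDEr | 1.26 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | DirectTransfer | BLEU | 13.89 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | DirectTransfer | METEOR | 25.76 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | DirectTransfer | ROUGE-L | 39.75 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | DirectTransfer | CIDEr | 1.26 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | NoisySupervised | BLEU | 30.12 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | NoisySupervised | METEOR | 28.12 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | NoisySupervised | ROUGE-L | 56.96 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | NoisySupervised | CIDEr | 2.52 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Warm | BLEU | 41.35 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Warm | METEOR | 35.20 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Warm | ROUGE-L | 63.01 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Fine) | CycleGT_Warm | CIDEr | 3.45 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Warm | BLEU | 40.47 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Warm | METEOR | 34.84 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Warm | ROUGE-L | 63.40 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Warm | CIDEr | 3.48 | # 2 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Base | BLEU | 41.29 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Base | METEOR | 35.39 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Base | ROUGE-L | 63.73 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | CycleGT_Base | CIDEr | 3.53 | # 1 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | Rule-Based | BLEU | 13.45 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | Rule-Based | METEOR | 30.72 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | Rule-Based | ROUGE-L | 40.93 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | Rule-Based | CIDEr | 1.26 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | DirectTransfer | BLEU | 13.89 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | DirectTransfer | METEOR | 25.76 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | DirectTransfer | ROUGE-L | 39.75 | # 5 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | DirectTransfer | CIDEr | 1.26 | # 4 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | NoisySupervised | BLEU | 35.03 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | NoisySupervised | METEOR | 33.45 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | NoisySupervised | ROUGE-L | 58.14 | # 3 |
Unsupervised KG-to-Text Generation | GenWiki (Full) | NoisySupervised | CIDEr | 2.63 | # 3 |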
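
The rank column above can be cross-checked against the reported scores. The following is a minimal sketch, not part of the paper: it assumes ranks are computed only among the five listed models, per dataset and metric, and that tied scores share the better rank (as the two CIDEr scores of 1.26 appear to). The names `SCORES`, `METRICS`, and `competition_ranks` are hypothetical and introduced here for illustration only.

```python
# Scores copied from the results table above, in the order
# (BLEU, METEOR, ROUGE-L, CIDEr). Nothing here is recomputed from model output.
METRICS = ("BLEU", "METEOR", "ROUGE-L", "CIDEr")

SCORES = {
    "GenWiki (Fine)": {
        "CycleGT_Base": (41.59, 35.72, 63.31, 3.57),
        "CycleGT_Warm": (41.35, 35.20, 63.01, 3.45),
        "NoisySupervised": (30.12, 28.12, 56.96, 2.52),
        "DirectTransfer": (13.89, 25.76, 39.75, 1.26),
        "Rule-Based": (13.45, 30.72, 40.93, 1.26),
    },
    "GenWiki (Full)": {
        "CycleGT_Base": (41.29, 35.39, 63.73, 3.53),
        "CycleGT_Warm": (40.47, 34.84, 63.40, 3.48),
        "NoisySupervised": (35.03, 33.45, 58.14, 2.63),
        "DirectTransfer": (13.89, 25.76, 39.75, 1.26),
        "Rule-Based": (13.45, 30.72, 40.93, 1.26),
    },
}


def competition_ranks(dataset, metric):
    """Rank models on one metric (higher is better).

    Assumption: tied scores share the better rank, so a model's rank is
    1 plus the number of models with a strictly higher score.
    """
    col = METRICS.index(metric)
    values = {model: row[col] for model, row in SCORES[dataset].items()}
    return {
        model: 1 + sum(other > value for other in values.values())
        for model, value in values.items()
    }


if __name__ == "__main__":
    for dataset in SCORES:
        for metric in METRICS:
            ranks = competition_ranks(dataset, metric)
            ordered = sorted(ranks.items(), key=lambda item: item[1])
            print(dataset, metric, ordered)
```

Under these assumptions the computed ranks match the table, including the shared rank of 4 for Rule-Based and DirectTransfer on CIDEr.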