GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation

COLING 2020 · Zhijing Jin, Qipeng Guo, Xipeng Qiu, Zheng Zhang

Collecting data for knowledge graph-to-text generation is expensive. As a result, research on unsupervised models has recently emerged as an active field. However, most unsupervised models have to use non-parallel versions of existing small supervised datasets, which largely constrains their potential. In this paper, we propose a large-scale, general-domain dataset, GenWiki. Our unsupervised dataset has 1.3M text examples and 1.3M graph examples. With a human-annotated test set, we provide this new benchmark dataset for future research on unsupervised text generation from knowledge graphs.
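
To make the phrase "non-parallel versions of existing small supervised datasets" concrete, here is a minimal sketch of one way to turn an aligned dataset into a non-parallel one: the graph side and the text side are shuffled independently, so no graph-text alignment remains. The function name `make_non_parallel`, the toy triples, and the sentences are hypothetical illustrations. They are not taken from the paper and do not describe how GenWiki itself was built.

```python
import random


def make_non_parallel(pairs, seed=0):
    """Split aligned (graph, text) pairs into two independently shuffled pools.

    Hypothetical helper for illustration only. After shuffling, position i in
    `graphs` no longer corresponds to position i in `texts`.
    """
    graphs = [graph for graph, _ in pairs]
    texts = [text for _, text in pairs]
    rng = random.Random(seed)
    rng.shuffle(graphs)  # shuffle the two sides separately,
    rng.shuffle(texts)   # which discards the original alignment
    return graphs, texts


# Toy data (made up): each graph is a list of (subject, relation, object) triples.
pairs = [
    ([("Alan Turing", "birthPlace", "London")],
     "Alan Turing was born in London."),
    ([("Paris", "capitalOf", "France")],
     "Paris is the capital of France."),
    ([("Python", "designer", "Guido van Rossum")],
     "Python was designed by Guido van Rossum."),
]

graphs, texts = make_non_parallel(pairs)
print(len(graphs), "graphs and", len(texts), "texts, with no pairing between them")
```

An unsupervised model trained on such data sees both pools but never a matched pair, which is the setting the benchmark below evaluates.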


Datasets


Introduced in the Paper:

GenWiki

Used in the Paper:

WikiBio
E2E
RoboCup

Task: Unsupervised KG-to-Text Generation (global rank in parentheses)

Dataset         Model            BLEU        METEOR      ROUGE-L     CIDEr
GenWiki (Fine)  CycleGT_Base     41.59 (#1)  35.72 (#1)  63.31 (#1)  3.57 (#1)
GenWiki (Fine)  Rule-Based       13.45 (#5)  30.72 (#3)  40.93 (#4)  1.26 (#4)
GenWiki (Fine)  DirectTransfer   13.89 (#4)  25.76 (#5)  39.75 (#5)  1.26 (#4)
GenWiki (Fine)  NoisySupervised  30.12 (#3)  28.12 (#4)  56.96 (#3)  2.52 (#3)
GenWiki (Fine)  CycleGT_Warm     41.35 (#2)  35.20 (#2)  63.01 (#2)  3.45 (#2)
GenWiki (Full)  CycleGT_Warm     40.47 (#2)  34.84 (#2)  63.40 (#2)  3.48 (#2)
GenWiki (Full)  CycleGT_Base     41.29 (#1)  35.39 (#1)  63.73 (#1)  3.53 (#1)
GenWiki (Full)  Rule-Based       13.45 (#5)  30.72 (#4)  40.93 (#4)  1.26 (#4)
GenWiki (Full)  DirectTransfer   13.89 (#4)  25.76 (#5)  39.75 (#5)  1.26 (#4)
GenWiki (Full)  NoisySupervised  35.03 (#3)  33.45 (#3)  58.14 (#3)  2.63 (#3)
