WikiGraphs is a dataset of Wikipedia articles, each paired with a knowledge graph, intended to facilitate research in conditional text generation, graph generation, and graph representation learning. Existing graph-text paired datasets typically contain small graphs and short text (one or a few sentences), which limits the capabilities of the models that can be learned from the data. WikiGraphs is collected by pairing each Wikipedia article from the established WikiText-103 benchmark with a subgraph from the Freebase knowledge graph. Both the graphs and the text are significantly larger in scale than those in prior graph-text paired datasets.
3 PAPERS • 1 BENCHMARK
EventNarrative is a knowledge graph-to-text dataset built from publicly available open-world knowledge graphs. It consists of approximately 230,000 graphs and their corresponding natural language text.
2 PAPERS • 1 BENCHMARK
GenWiki is a large-scale dataset for knowledge graph-to-text (G2T) and text-to-knowledge graph (T2G) conversion. It was introduced in the paper "GenWiki: A Dataset of 1.3 Million Content-Sharing Text and Graphs for Unsupervised Graph-to-Text Generation" by Zhijing Jin, Qipeng Guo, Xipeng Qiu, and Zheng Zhang at COLING
7 PAPERS • 2 BENCHMARKS
ENT-DESC involves retrieving abundant knowledge about various types of main entities from a large knowledge graph (KG), which causes current graph-to-sequence models to suffer severely from the problems of
1 PAPER • 1 BENCHMARK
Abstract GENeration DAtaset (AGENDA) is a dataset of knowledge graphs paired with scientific abstracts.
19 PAPERS • 1 BENCHMARK