Data-to-Text Generation
105 papers with code • 24 benchmarks • 22 datasets
A classic problem in natural-language generation (NLG) involves taking structured data, such as a table, as input, and producing text that adequately and fluently describes this data as output. Unlike machine translation, which aims for complete transduction of the sentence to be translated, this form of NLG is usually taken to require addressing (at least) two separate challenges: what to say, the selection of an appropriate subset of the input data to discuss, and how to say it, the surface realization of a generation.
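The two challenges named above can be illustrated with a toy pipeline. This is a minimal sketch, not a method from any of the papers listed here: the record, its field names, the threshold rule and the sentence template are all assumptions made for illustration.

```python
from typing import Dict, List

# Hypothetical input record; field names and values are invented.
record: Dict[str, object] = {
    "player": "Jane Doe",
    "points": 31,
    "rebounds": 4,
    "assists": 11,
    "minutes": 36,
}


def select_content(rec: Dict[str, object], threshold: int = 10) -> Dict[str, object]:
    """What to say: keep the name plus any stat at or above the threshold."""
    selected: Dict[str, object] = {"player": rec["player"]}
    for key, value in rec.items():
        # "minutes" is skipped by assumption: it is context, not an achievement.
        if isinstance(value, int) and key != "minutes" and value >= threshold:
            selected[key] = value
    return selected


def realize(selected: Dict[str, object]) -> str:
    """How to say it: fill a fixed template with the selected fields."""
    stats: List[str] = [f"{v} {k}" for k, v in selected.items() if k != "player"]
    if not stats:
        return f"{selected['player']} had a quiet game."
    if len(stats) == 1:
        joined = stats[0]
    else:
        joined = ", ".join(stats[:-1]) + " and " + stats[-1]
    return f"{selected['player']} recorded {joined}."


sentence = realize(select_content(record))
print(sentence)
```

Neural systems typically learn both steps from data rather than hard-coding them, but the split between selection and realization is the same.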
(Image credit: Data-to-Text Generation with Content Selection and Planning)
Most implemented papers
Semantically Conditioned Dialog Response Generation via Hierarchical Disentangled Self-Attention
Semantically controlled neural response generation in limited domains has achieved great performance.
Data-to-text Generation with Entity Modeling
Recent approaches to data-to-text generation have shown great promise thanks to the use of large-scale datasets and the application of neural network architectures which are trained end-to-end.
Learning to Select, Track, and Generate for Data-to-Text
We propose a data-to-text generation model with two modules, one for tracking and the other for text generation.
Long and Diverse Text Generation with Planning-based Hierarchical Variational Model
Existing neural methods for data-to-text generation are still struggling to produce long and diverse texts: they are insufficient to model input data dynamically during generation, to capture inter-sentence coherence, or to generate diversified expressions.
Few-shot Natural Language Generation for Task-Oriented Dialog
The proposed model, SC-GPT, is pre-trained on a large annotated NLG corpus to acquire controllable generation ability, and fine-tuned with only a few domain-specific labels to adapt to new domains.
Text-to-Text Pre-Training for Data-to-Text Tasks
We study the pre-train + fine-tune strategy for data-to-text tasks.
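A text-to-text model consumes a plain string, so structured input generally has to be flattened before fine-tuning. The sketch below shows one possible linearization of subject-predicate-object triples. The `<S>`, `<P>` and `<O>` markers, the `linearize` helper and the example triples are assumptions for illustration, not necessarily the scheme used in the paper.

```python
from typing import Dict, List, Tuple

Triple = Tuple[str, str, str]


def linearize(triples: List[Triple]) -> str:
    """Flatten (subject, predicate, object) triples into one input string."""
    parts = []
    for subj, pred, obj in triples:
        # Marker tokens are a hypothetical convention chosen for this sketch.
        parts.append(f"<S> {subj} <P> {pred} <O> {obj}")
    return " ".join(parts)


# Illustrative data, invented for this example.
triples: List[Triple] = [
    ("Blue Spice", "eatType", "coffee shop"),
    ("Blue Spice", "area", "riverside"),
]

# One fine-tuning pair: linearized data as input, reference text as target.
example: Dict[str, str] = {
    "input": linearize(triples),
    "target": "Blue Spice is a coffee shop in the riverside area.",
}
print(example["input"])
```

Pairs in this form can then be passed to any sequence-to-sequence fine-tuning setup.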
Data-QuestEval: A Referenceless Metric for Data-to-Text Semantic Evaluation
QuestEval is a reference-less metric used in text-to-text tasks that compares generated summaries directly to the source text by automatically asking and answering questions.
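The idea of scoring output against the source rather than a reference can be shown with a deliberately simplified stand-in. This is not QuestEval: the real metric relies on learned question generation and question answering models. Here each source attribute plays the role of a question, "answering" is a substring lookup, and the function name `toy_referenceless_score` is hypothetical.

```python
from typing import Dict


def toy_referenceless_score(source: Dict[str, str], generated: str) -> float:
    """Fraction of source values that can be recovered from the generated text."""
    if not source:
        return 0.0
    text = generated.lower()
    answered = 0
    for attribute, value in source.items():
        # Stand-in "question": what is the <attribute>?
        # Stand-in "answer": the value counts as answered if it appears in the text.
        if value.lower() in text:
            answered += 1
    return answered / len(source)


# Illustrative data, invented for this example.
source = {"name": "Blue Spice", "eatType": "coffee shop", "area": "riverside"}
full_score = toy_referenceless_score(
    source, "Blue Spice is a coffee shop in the riverside area."
)
partial_score = toy_referenceless_score(source, "Blue Spice is a coffee shop.")
print(full_score, partial_score)
```

This toy score only checks whether source content is covered by the text. It does not penalize statements in the text that the source does not support.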
Plan-then-Generate: Controlled Data-to-Text Generation via Planning
However, the inability of neural models to control the structure of generated output can be limiting in certain real-world applications.
Control Prefixes for Parameter-Efficient Text Generation
Prefix-tuning is a powerful lightweight technique for adapting a large pre-trained language model to a downstream application.
Chart-to-Text: A Large-Scale Benchmark for Chart Summarization
We also introduce a number of state-of-the-art neural models as baselines that utilize image captioning and data-to-text generation techniques to tackle two problem variations: one assumes the underlying data table of the chart is available while the other needs to extract data from chart images.