Text Generation

1454 papers with code • 167 benchmarks • 149 datasets

Text Generation is the task of generating text with the goal of appearing indistinguishable from human-written text. This task is more formally known as "natural language generation" (NLG) in the literature.

Text generation can be addressed with Markov processes or deep generative models like LSTMs. More recently, some of the most advanced methods include large pre-trained Transformer models such as BART and GPT, as well as GAN-based approaches. Text generation systems are evaluated either through human ratings or automatic evaluation metrics like METEOR, ROUGE, and BLEU.
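
Since Markov processes are mentioned as the simplest approach, here is a minimal bigram Markov-chain generator as a baseline sketch; the toy corpus and the helper names (`build_bigram_model`, `generate`) are illustrative, not taken from any of the listed papers.

```python
import random
from collections import defaultdict

# Minimal bigram Markov-chain text generator: each next word is sampled
# conditioned only on the previous word observed in the training corpus.
def build_bigram_model(corpus: str) -> dict:
    model = defaultdict(list)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        model[prev].append(nxt)
    return model

def generate(model: dict, seed: str, length: int = 15) -> str:
    out = [seed]
    for _ in range(length - 1):
        successors = model.get(out[-1])
        if not successors:          # dead end: no observed successor
            break
        out.append(random.choice(successors))
    return " ".join(out)

corpus = "the cat sat on the mat and the dog sat on the rug"
print(generate(build_bigram_model(corpus), seed="the"))
```

Deep generative models replace this bigram lookup table with a learned conditional distribution over the entire preceding context.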

(Image credit: Adversarial Ranking for Language Generation)

Libraries

Use these libraries to find Text Generation models and implementations

Most implemented papers

Unified Language Model Pre-training for Natural Language Understanding and Generation

microsoft/unilm NeurIPS 2019

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.
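
UniLM shares one Transformer across objectives and switches between them with different self-attention masks. The sketch below, with a hypothetical `seq2seq_mask` helper and arbitrary lengths, shows the mask for the sequence-to-sequence objective: source tokens attend bidirectionally within the source, while target tokens attend to the full source plus earlier target tokens only.

```python
import torch

# Hypothetical helper illustrating UniLM's seq2seq self-attention mask.
def seq2seq_mask(src_len: int, tgt_len: int) -> torch.Tensor:
    n = src_len + tgt_len
    allowed = torch.zeros(n, n, dtype=torch.bool)            # True = may attend
    allowed[:, :src_len] = True                              # every token sees the source
    allowed[src_len:, src_len:] = torch.tril(
        torch.ones(tgt_len, tgt_len, dtype=torch.bool)
    )                                                        # causal within the target
    return allowed

print(seq2seq_mask(3, 4).int())                              # 3 source + 4 target tokens
```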

HuggingFace's Transformers: State-of-the-art Natural Language Processing

huggingface/transformers 9 Oct 2019

Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks.
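
The library exposes most text-generation checkpoints through a single pipeline API. A minimal sketch, assuming the publicly available gpt2 checkpoint and illustrative sampling settings:

```python
from transformers import pipeline

# Text-generation pipeline with the small public gpt2 checkpoint; the model
# choice and sampling settings are illustrative, not prescriptive.
generator = pipeline("text-generation", model="gpt2")
outputs = generator(
    "Text generation systems are evaluated",
    max_new_tokens=30,
    do_sample=True,
    top_p=0.95,
)
print(outputs[0]["generated_text"])
```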

Retrieval-Augmented Generation for Knowledge-Intensive NLP Tasks

huggingface/transformers NeurIPS 2020

Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks.
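
The core recipe is retrieve-then-generate: fetch passages relevant to the query, condition a seq2seq generator on them, and decode. The sketch below hand-rolls that loop; the toy corpus, the naive lexical-overlap retriever, and the flan-t5-small checkpoint are illustrative assumptions, not the paper's DPR-plus-BART setup.

```python
from transformers import pipeline

# Hand-rolled retrieve-then-generate loop in the spirit of RAG.
corpus = [
    "Marie Curie won Nobel Prizes in Physics and Chemistry.",
    "The Eiffel Tower is located in Paris.",
    "BLEU is a common automatic metric for generated text.",
]

def retrieve(query: str, k: int = 2) -> list:
    # Rank documents by word overlap with the query (stand-in for a dense retriever).
    q = set(query.lower().split())
    return sorted(corpus, key=lambda doc: len(q & set(doc.lower().split())), reverse=True)[:k]

generator = pipeline("text2text-generation", model="google/flan-t5-small")
query = "Where is the Eiffel Tower?"
context = " ".join(retrieve(query))
print(generator(f"question: {query} context: {context}")[0]["generated_text"])
```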

MASS: Masked Sequence to Sequence Pre-training for Language Generation

microsoft/MASS 7 May 2019

Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from a rich-resource pre-training task to low/zero-resource downstream tasks.
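
MASS masks a contiguous fragment of the input sentence on the encoder side and trains the decoder to reconstruct exactly that fragment. A minimal sketch of building such a training pair, with a placeholder [MASK] symbol and an arbitrary mask ratio:

```python
import random

# Build a MASS-style training pair: mask a contiguous span on the encoder side
# and use exactly that span as the decoder target. The [MASK] placeholder and
# the 50% mask ratio are illustrative choices, not the paper's exact setup.
def make_mass_pair(tokens, mask_ratio=0.5):
    span_len = max(1, int(len(tokens) * mask_ratio))
    start = random.randint(0, len(tokens) - span_len)
    target = tokens[start:start + span_len]                       # decoder reconstructs this
    source = tokens[:start] + ["[MASK]"] * span_len + tokens[start + span_len:]
    return source, target

src, tgt = make_mass_pair("the quick brown fox jumps over the lazy dog".split())
print("encoder input :", src)
print("decoder target:", tgt)
```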

CTRL: A Conditional Transformer Language Model for Controllable Generation

PaddlePaddle/PaddleNLP Preprint 2019

Large-scale language models show promising text generation capabilities, but users cannot easily control particular aspects of the generated text.
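
CTRL conditions generation by prepending a control code (a domain or style marker) to the prompt. A minimal sketch with the Transformers pipeline; the Salesforce/ctrl checkpoint id, the "Reviews" control code, and the generation length are assumptions for illustration.

```python
from transformers import pipeline

# Conditioning on a control code simply means prepending it to the prompt.
generator = pipeline("text-generation", model="Salesforce/ctrl")
prompt = "Reviews The battery life of this laptop"   # control code + user prompt
print(generator(prompt, max_new_tokens=40)[0]["generated_text"])
```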

Plug and Play Language Models: A Simple Approach to Controlled Text Generation

uber-research/PPLM ICLR 2020

Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities.
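
PPLM itself steers generation by backpropagating gradients from a small attribute model into the LM's cached activations at each decoding step. The sketch below substitutes a much cruder trick, adding a fixed logit bonus to a bag of topic words, just to illustrate step-wise steering of an unchanged GPT-2; the topic list and bonus value are arbitrary assumptions, and this is not the paper's method.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

# Crude stand-in for PPLM's bag-of-words attribute model: boost topic-word
# logits at each step instead of gradient updates to the LM's activations.
tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

topic_ids = [tokenizer.encode(" " + w)[0] for w in ["ocean", "ship", "waves"]]
ids = tokenizer.encode("The story begins with", return_tensors="pt")

for _ in range(20):
    with torch.no_grad():
        logits = model(ids).logits[:, -1, :]
    logits[:, topic_ids] += 5.0                      # nudge generation toward the topic
    next_id = torch.argmax(logits, dim=-1, keepdim=True)
    ids = torch.cat([ids, next_id], dim=-1)

print(tokenizer.decode(ids[0]))
```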

NLG Evaluation Metrics Beyond Correlation Analysis: An Empirical Metric Preference Checklist

inimah/metric-preference-checklist 15 May 2023

Our proposed framework provides access (i) for verifying whether automatic metrics are faithful to human preference, regardless of their correlation level with human judgments, and (ii) for inspecting the strengths and limitations of NLG systems via pairwise evaluation.
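
Pairwise evaluation here boils down to checking, for each pair of system outputs, whether the automatic metric prefers the same output as the human judge. A minimal agreement count with made-up placeholder scores:

```python
# For each pair of system outputs, check whether the automatic metric prefers
# the same output as the human judge. The scores are made-up placeholders.
pairs = [
    # (metric score for output A, metric score for output B, human-preferred output)
    (0.42, 0.35, "A"),
    (0.18, 0.40, "B"),
    (0.55, 0.52, "B"),
]

agreements = sum(("A" if a > b else "B") == human for a, b, human in pairs)
print(f"metric-human pairwise agreement: {agreements}/{len(pairs)}")
```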

A Hierarchical Neural Autoencoder for Paragraphs and Documents

jiweil/Hierarchical-Neural-Autoencoder IJCNLP 2015

Natural language generation of coherent long texts like paragraphs or longer documents is a challenging problem for recurrent network models.
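
The model builds document representations hierarchically: a word-level LSTM encodes each sentence into a vector, and a sentence-level LSTM encodes the sequence of sentence vectors into a document vector. A sketch of that encoder half (the decoder mirrors it), with arbitrary sizes and randomly generated token ids:

```python
import torch
import torch.nn as nn

# Encoder half of a hierarchical autoencoder: word-level LSTM -> sentence
# vectors, sentence-level LSTM -> document vector. All sizes are arbitrary.
class HierarchicalEncoder(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=32, hidden_dim=64):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_lstm = nn.LSTM(emb_dim, hidden_dim, batch_first=True)
        self.sent_lstm = nn.LSTM(hidden_dim, hidden_dim, batch_first=True)

    def forward(self, doc):                                  # doc: (sentences, words) token ids
        _, (h_word, _) = self.word_lstm(self.embed(doc))     # h_word: (1, sentences, hidden)
        sent_vecs = h_word[0].unsqueeze(0)                   # (1, sentences, hidden)
        _, (h_sent, _) = self.sent_lstm(sent_vecs)           # h_sent: (1, 1, hidden)
        return h_sent[0]                                     # document vector: (1, hidden)

doc = torch.randint(0, 1000, (3, 7))                         # 3 sentences of 7 (padded) tokens
print(HierarchicalEncoder()(doc).shape)                      # torch.Size([1, 64])
```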

Sequence-to-Sequence Learning as Beam-Search Optimization

harvardnlp/BSO EMNLP 2016

In this work, we introduce a model and beam-search training scheme, based on the work of Daume III and Marcu (2005), that extends seq2seq to learn global sequence scores.
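
For context, the sketch below shows plain beam-search decoding over a toy next-token table, i.e. how candidate sequences are ranked by a global summed log-probability score; the paper's actual contribution is training with a beam-search-based loss rather than this inference procedure. The probability table and beam size are made up for illustration.

```python
import math

# Plain beam-search decoding over a toy next-token distribution.
next_prob = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.3, "</s>": 0.2},
    "a":   {"dog": 0.7, "</s>": 0.3},
    "cat": {"</s>": 1.0},
    "dog": {"</s>": 1.0},
}

def beam_search(beam_size=2, max_len=4):
    beams = [(["<s>"], 0.0)]                                 # (tokens, summed log-prob)
    for _ in range(max_len):
        candidates = []
        for tokens, score in beams:
            if tokens[-1] == "</s>":                         # finished hypotheses carry over
                candidates.append((tokens, score))
                continue
            for tok, p in next_prob[tokens[-1]].items():
                candidates.append((tokens + [tok], score + math.log(p)))
        beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_size]
    return beams

for tokens, score in beam_search():
    print(" ".join(tokens), round(score, 3))
```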

Boundary-Seeking Generative Adversarial Networks

eriklindernoren/PyTorch-GAN 27 Feb 2017

We introduce a method for training GANs with discrete data that uses the estimated difference measure from the discriminator to compute importance weights for generated samples, thus providing a policy gradient for training the generator.
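
In the discrete case the update resembles a weighted policy gradient: the discriminator's scores are turned into normalized importance weights over generated samples, which then weight the generator's log-likelihood gradient. A heavily simplified sketch with random placeholder tensors, not the paper's exact estimator:

```python
import torch

# Weighted policy-gradient surrogate in the spirit of boundary-seeking GANs
# for discrete data. All tensors here are random placeholders.
disc_scores = torch.randn(8)                            # discriminator logits for 8 samples
log_probs = torch.randn(8, requires_grad=True)          # stand-in for generator log P(sample)

weights = torch.softmax(disc_scores, dim=0).detach()    # normalized importance weights
generator_loss = -(weights * log_probs).sum()           # REINFORCE-style weighted surrogate
generator_loss.backward()
print(log_probs.grad)
```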