526 papers with code • 13 benchmarks • 66 datasets
Text generation is the task of generating text with the goal of being indistinguishable from human-written text.
(Image credit: Adversarial Ranking for Language Generation)
Large pre-trained language models have been shown to store factual knowledge in their parameters, and achieve state-of-the-art results when fine-tuned on downstream NLP tasks.
Ranked #7 on Question Answering on Natural Questions (short)
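A minimal sketch of the pretrain-then-fine-tune recipe this excerpt describes, written against the Hugging Face transformers API rather than the paper's own code; the model name, learning rate, and toy QA pair are illustrative placeholders, not the paper's setup.

```python
# Illustrative sketch: fine-tune a pretrained seq2seq LM on a downstream
# QA-style pair. Model name, learning rate, and data are placeholders.
import torch
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("facebook/bart-base")
model = AutoModelForSeq2SeqLM.from_pretrained("facebook/bart-base")
optimizer = torch.optim.AdamW(model.parameters(), lr=3e-5)

# Toy example standing in for a downstream QA dataset.
question = "Who wrote Hamlet?"
answer = "William Shakespeare"

inputs = tokenizer(question, return_tensors="pt")
labels = tokenizer(answer, return_tensors="pt").input_ids

model.train()
optimizer.zero_grad()
loss = model(**inputs, labels=labels).loss  # standard seq2seq cross-entropy
loss.backward()
optimizer.step()
```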
Large transformer-based language models (LMs) trained on huge text corpora have shown unparalleled generation capabilities.
We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
Ranked #3 on Text Summarization on X-Sum
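A toy sketch of the two noising schemes this excerpt names, sentence permutation and text infilling. The Poisson span length follows the BART paper, but the masking probability, tokenization, and handling of length-0 spans are simplified for illustration.

```python
# Toy denoising-pretraining noise functions: shuffle sentence order, and
# replace sampled token spans with a SINGLE mask token each.
import random
import numpy as np

MASK = "<mask>"

def shuffle_sentences(sentences):
    """Sentence permutation: randomly shuffle the original sentence order."""
    out = list(sentences)
    random.shuffle(out)
    return out

def infill(tokens, mask_prob=0.15, lam=3.0):
    """Text infilling: each sampled span collapses to one mask token.

    Span lengths are Poisson(lam)-distributed as in the BART paper;
    mask_prob (illustrative) controls how often a span starts. Length-0
    spans, which BART treats as pure insertions, are simplified here.
    """
    out, i = [], 0
    while i < len(tokens):
        if random.random() < mask_prob:
            span = np.random.poisson(lam)
            out.append(MASK)       # the whole span becomes ONE token
            i += max(span, 1)
        else:
            out.append(tokens[i])
            i += 1
    return out
```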
Transformer architectures have facilitated building higher-capacity models and pretraining has made it possible to effectively utilize this capacity for a wide variety of tasks.
Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.
Ranked #1 on Language Modelling on enwik8 (using extra training data)
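For reference, sampling from a pretrained language model of this kind is a few lines with the Hugging Face pipeline API; this is illustrative tooling, not the paper's own code, and the prompt and sampling parameters are arbitrary.

```python
# Sample a continuation from a pretrained GPT-2 checkpoint.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Machine translation can be framed as",
    max_new_tokens=40,  # length of the sampled continuation
    do_sample=True,
    top_k=50,
)
print(result[0]["generated_text"])
```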
We propose encoder-centric stepwise models for extractive summarization using structured transformers -- HiBERT and Extended Transformers.
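The stepwise idea can be sketched as greedy selection that re-scores candidates after each pick; the scorer below is a hypothetical placeholder, and the actual HiBERT / Extended Transformer encoders are not reproduced here.

```python
# Toy stepwise extractive selection with a placeholder scorer.
def stepwise_extract(sentences, score_fn, k=3):
    """Pick k sentences one at a time, re-scoring after each selection.

    score_fn(candidate, selected) -> float is a stand-in for an encoder
    that conditions each score on the sentences chosen so far.
    """
    selected = []
    remaining = list(sentences)
    for _ in range(min(k, len(remaining))):
        best = max(remaining, key=lambda s: score_fn(s, selected))
        selected.append(best)
        remaining.remove(best)
    return selected
```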
An ideal environment for evaluating dialog systems, namely the Turing test, requires human interaction, which is usually too expensive for large-scale experiments.
fairseq is an open-source sequence modeling toolkit that allows researchers and developers to train custom models for translation, summarization, language modeling, and other text generation tasks.
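As a usage example, fairseq exposes pretrained models through torch.hub; the snippet below follows the toolkit's documented hub interface and assumes fairseq and its tokenizer dependencies (moses, fastBPE) are installed.

```python
# Load a pretrained fairseq translation model via torch.hub and translate.
import torch

en2de = torch.hub.load(
    "pytorch/fairseq",
    "transformer.wmt19.en-de.single_model",
    tokenizer="moses",
    bpe="fastbpe",
)
print(en2de.translate("Hello world!"))  # expected German output
```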