Abstractive Text Summarization

325 papers with code • 19 benchmarks • 48 datasets

Abstractive Text Summarization is the task of generating a short, concise summary that captures the salient ideas of the source text. The generated summaries may contain new phrases and sentences that do not appear in the source text.

Source: Generative Adversarial Network for Abstractive Text Summarization

Image credit: Abstractive Text Summarization using Sequence-to-sequence RNNs and Beyond
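
As a rough illustration of the definition above, one simple way to gauge how abstractive a summary is involves counting the n-grams in the summary that never occur in the source. The sketch below is a minimal version under stated assumptions: whitespace tokenization, lowercasing, and the helper names `ngrams` and `novel_ngram_ratio` are illustrative choices, not part of any benchmark listed on this page.

```python
def ngrams(tokens, n):
    """Return all contiguous n-grams of a token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def novel_ngram_ratio(source, summary, n=2):
    """Fraction of summary n-grams that do not occur in the source.

    Assumption: naive lowercased whitespace tokenization, which is
    enough for a sketch but too crude for real evaluation.
    """
    source_grams = set(ngrams(source.lower().split(), n))
    summary_grams = ngrams(summary.lower().split(), n)
    if not summary_grams:
        return 0.0  # summary shorter than n tokens: nothing to count
    novel = sum(1 for gram in summary_grams if gram not in source_grams)
    return novel / len(summary_grams)


source = "the cat sat on the mat"
print(novel_ngram_ratio(source, "the cat sat"))   # copied span
print(novel_ngram_ratio(source, "a cat rested"))  # rephrased
```

A ratio of 0.0 means every summary bigram was copied from the source, so the summary is extractive at that n-gram order. Values near 1.0 indicate mostly new phrasing.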

Benchmarks

These leaderboards are used to track progress in Abstractive Text Summarization

Dataset	Best Model
CNN / Daily Mail	Pegasus
Abstractive Text Summarization from Il Post	mBART
Abstractive Text Summarization from Fanpage	mBART
EDUsum	Seq2seq
MLSum-it	mBART
WITS	BART-IT
vietnews	ViT5 large
AESLC	PEGASUS
CNN/Daily Mail	BART (TextBox 2.0)
WikiHow	BertSum
MLSUM de	mBART
MLSUM es	mBART
Inshorts News	T2SAM

Libraries

Use these libraries to find Abstractive Text Summarization models and implementations

Library	Papers	Stars
huggingface/transformers	5	125,059
microsoft/unilm	5	18,335
theamrzaki/text_summurization_abstr…	5	518
pytorch/fairseq	3	29,255

See all 16 libraries.

Most implemented papers

Attention Is All You Need

tensorflow/tensor2tensor • NeurIPS 2017

The dominant sequence transduction models are based on complex recurrent or convolutional neural networks in an encoder-decoder configuration.

BART: Denoising Sequence-to-Sequence Pre-training for Natural Language Generation, Translation, and Comprehension

huggingface/transformers • ACL 2020

We evaluate a number of noising approaches, finding the best performance by both randomly shuffling the order of the original sentences and using a novel in-filling scheme, where spans of text are replaced with a single mask token.
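
The two noising transformations named in this abstract can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: the `<mask>` string, the helper names, and the explicit `(start, length)` span list are assumptions made for the example, and how spans are sampled is left to the caller.

```python
import random

MASK = "<mask>"  # placeholder token string, assumed for illustration


def shuffle_sentences(sentences, rng):
    """Return the sentences in a random order, leaving the input untouched."""
    shuffled = list(sentences)
    rng.shuffle(shuffled)
    return shuffled


def infill(tokens, spans):
    """Replace each (start, length) span of tokens with a single MASK.

    Assumption: spans are sorted by start and do not overlap.
    A span of length 0 inserts a MASK without removing any token.
    """
    out = []
    pos = 0
    for start, length in spans:
        out.extend(tokens[pos:start])  # copy tokens before the span
        out.append(MASK)               # whole span collapses to one mask
        pos = start + length           # skip the masked tokens
    out.extend(tokens[pos:])           # copy the remaining tail
    return out


tokens = "the quick brown fox jumps".split()
print(infill(tokens, [(1, 2)]))  # ['the', '<mask>', 'fox', 'jumps']
print(shuffle_sentences(["First.", "Second.", "Third."], random.Random(0)))
```

Because a span of any length collapses to one mask token, a model trained to undo this corruption has to predict how many tokens are missing as well as what they are.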

Get To The Point: Summarization with Pointer-Generator Networks

abisee/pointer-generator • ACL 2017

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text).

Text Summarization with Pretrained Encoders

nlpyang/PreSumm • IJCNLP 2019

For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not).

PEGASUS: Pre-training with Extracted Gap-sentences for Abstractive Summarization

google-research/pegasus • ICML 2020

Recent work pre-training Transformers with self-supervised objectives on large text corpora has shown great success when fine-tuned on downstream NLP tasks including text summarization.

A Deep Reinforced Model for Abstractive Summarization

theamrzaki/text_summurization_abstractive_methods • ICLR 2018

We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL).

Unified Language Model Pre-training for Natural Language Understanding and Generation

microsoft/unilm • NeurIPS 2019

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

THUDM/GLM • ACL 2022

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25× parameters of BERT Large, demonstrating its generalizability to different downstream tasks.
