Document Summarization

195 papers with code • 7 benchmarks • 28 datasets

Automatic Document Summarization is the task of rewriting a document into a shorter form while retaining its important content. The two most popular paradigms are extractive and abstractive approaches. Extractive approaches generate summaries by extracting parts of the original document (usually sentences), while abstractive methods may generate new words or phrases that are not in the original document.
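
The extractive paradigm described above can be illustrated with a minimal sketch. It assumes a naive regex sentence splitter and a simple word-frequency score; both are illustrative choices made here, not the method of any paper listed on this page, and the names `split_sentences` and `extractive_summary` are hypothetical.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive splitter (an assumption): break after ., ! or ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def extractive_summary(text, k=1):
    """Return the k highest-scoring sentences, kept in document order."""
    sentences = split_sentences(text)
    tokenized = [re.findall(r"[a-z0-9']+", s.lower()) for s in sentences]
    # Document-level word counts serve as a crude importance signal.
    freq = Counter(word for words in tokenized for word in words)

    def score(words):
        # Average document frequency of the words in one sentence.
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(tokenized[i]),
                    reverse=True)
    chosen = sorted(ranked[:k])  # restore original sentence order
    return " ".join(sentences[i] for i in chosen)


if __name__ == "__main__":
    doc = "Cats sleep a lot. Cats and dogs are pets. The weather is nice."
    print(extractive_summary(doc, k=1))
```

Every sentence in the output is copied verbatim from the input, which is the defining property of an extractive summary. An abstractive system would instead generate the summary token by token and could produce words that never occur in the source.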

Source: HIBERT: Document Level Pre-training of Hierarchical Bidirectional Transformers for Document Summarization

Benchmarks

These leaderboards are used to track progress in Document Summarization.

Dataset	Best Model
CNN / Daily Mail	Scrambled code + broken (alter)
HowSumm-Step	LexRank (query: step title)
HowSumm-Method	LexRank (query: method + article + steps titles)
BBC XSum	BigBird-Pegasus
Arxiv HEP-TH citation graph	DeepPyramidion
arXiv Summarization Dataset	DeepPyramidion
WikiLingua (tr->en)	DOCmT5

Libraries

Use these libraries to find Document Summarization models and implementations

Library	Papers	Stars
huggingface/transformers	3	124,984
thudm/swissarmytransformer	2	842
HHousen/TransformerSum	2	425
shashiongithub/XSum	2	338

See all 6 libraries.

Datasets

Subtasks

Email Thread Summarization

Most implemented papers

Get To The Point: Summarization with Pointer-Generator Networks

abisee/pointer-generator • ACL 2017

Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text).

Text Summarization with Pretrained Encoders

nlpyang/PreSumm • IJCNLP 2019

For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not).
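
The idea of treating the pretrained encoder and the freshly initialized decoder differently can be sketched as two independent learning-rate schedules. This is a minimal sketch, not the paper's implementation: the inverse-square-root warmup form and every constant below are assumptions chosen for illustration, and the names `warmup_lr`, `ENCODER_SCHEDULE`, `DECODER_SCHEDULE`, and `learning_rates` are hypothetical.

```python
def warmup_lr(step, scale, warmup_steps):
    """Linear warmup followed by inverse-square-root decay (assumed schedule form)."""
    step = max(step, 1)  # avoid 0 ** -0.5
    return scale * min(step ** -0.5, step * warmup_steps ** -1.5)


# Illustrative values only (assumptions): the pretrained encoder gets a smaller
# scale and a longer warmup so its weights move slowly while the untrained
# decoder, with a larger scale and shorter warmup, catches up.
ENCODER_SCHEDULE = {"scale": 2e-3, "warmup_steps": 20_000}
DECODER_SCHEDULE = {"scale": 0.1, "warmup_steps": 10_000}


def learning_rates(step):
    """Learning rate for each component at a given training step."""
    return {
        "encoder": warmup_lr(step, **ENCODER_SCHEDULE),
        "decoder": warmup_lr(step, **DECODER_SCHEDULE),
    }


if __name__ == "__main__":
    for step in (1_000, 10_000, 20_000, 50_000):
        print(step, learning_rates(step))
```

In a real training loop, each schedule would drive its own optimizer instance, one holding the encoder parameters and one holding the decoder parameters, so the two components can be updated at different rates.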

Language Models are Unsupervised Multitask Learners

openai/gpt-2 • Preprint 2019

Natural language processing tasks, such as question answering, machine translation, reading comprehension, and summarization, are typically approached with supervised learning on task-specific datasets.

Unified Language Model Pre-training for Natural Language Understanding and Generation

microsoft/unilm • NeurIPS 2019

This paper presents a new Unified pre-trained Language Model (UniLM) that can be fine-tuned for both natural language understanding and generation tasks.

GLM: General Language Model Pretraining with Autoregressive Blank Infilling

THUDM/GLM • ACL 2022

On a wide range of tasks across NLU, conditional and unconditional generation, GLM outperforms BERT, T5, and GPT given the same model sizes and data, and achieves the best performance from a single pretrained model with 1.25× the parameters of BERT Large, demonstrating its generalizability to different downstream tasks.

SummaRuNNer: A Recurrent Neural Network based Sequence Model for Extractive Summarization of Documents

kedz/nnsum • 14 Nov 2016

We present SummaRuNNer, a Recurrent Neural Network (RNN) based sequence model for extractive summarization of documents, and show that it achieves performance better than or comparable to the state of the art.

Bottom-Up Abstractive Summarization

sebastianGehrmann/bottom-up-summary • EMNLP 2018

We use a content selector as a bottom-up attention step to constrain the model to likely phrases.

Extending Context Window of Large Language Models via Positional Interpolation

pku-yuangroup/open-sora-plan • 27 Jun 2023

We present Position Interpolation (PI), which extends the context window sizes of RoPE-based pretrained LLMs such as LLaMA models up to 32768 with minimal fine-tuning (within 1000 steps), while demonstrating strong empirical results on various tasks that require long context, including passkey retrieval, language modeling, and long document summarization, from LLaMA 7B to 65B.
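
The core of Position Interpolation can be sketched in a few lines: rather than feeding RoPE position indices beyond the range seen in pretraining, each index is linearly rescaled so that the extended window maps back into the trained range. The sketch below assumes a trained context length of 2048 (an assumption made here, not stated above) and uses hypothetical helper names; only the target length 32768 comes from the text.

```python
def rope_angles(position, dim, base=10000.0):
    """Rotation angles used by RoPE for one position (one angle per dimension pair)."""
    return [position * base ** (-2.0 * i / dim) for i in range(dim // 2)]


def interpolated_position(position, trained_len, extended_len):
    """Rescale a position index from the extended window into the trained range."""
    return position * trained_len / extended_len


def interpolated_rope_angles(position, dim, trained_len, extended_len, base=10000.0):
    """RoPE angles computed on the rescaled (possibly fractional) position."""
    scaled = interpolated_position(position, trained_len, extended_len)
    return rope_angles(scaled, dim, base)


if __name__ == "__main__":
    TRAINED_LEN = 2048     # assumed pretraining context length
    EXTENDED_LEN = 32768   # target context length mentioned in the text
    # The last position of the extended window still lands below TRAINED_LEN.
    print(interpolated_position(EXTENDED_LEN - 1, TRAINED_LEN, EXTENDED_LEN))
    print(interpolated_rope_angles(16384, 8, TRAINED_LEN, EXTENDED_LEN))
```

Because the rescaled positions stay inside the interval the model was trained on, the attention layers see only interpolated angles rather than extrapolated ones, which is consistent with the claim that a short fine-tuning run suffices.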

Generating Wikipedia by Summarizing Long Sequences

tensorflow/tensor2tensor • ICLR 2018

We show that generating English Wikipedia articles can be approached as a multi-document summarization of source documents.

Don't Give Me the Details, Just the Summary! Topic-Aware Convolutional Neural Networks for Extreme Summarization

shashiongithub/XSum • EMNLP 2018

We introduce extreme summarization, a new single-document summarization task which does not favor extractive strategies and calls for an abstractive modeling approach.
