Multi-Document Summarization

93 papers with code • 5 benchmarks • 15 datasets

Multi-Document Summarization is the task of representing a set of documents with a short piece of text that captures the relevant information and filters out redundant information. The two prominent approaches are extractive and abstractive summarization. Extractive systems select salient snippets, sentences, or passages from the documents, while abstractive systems concisely paraphrase the content of the documents.

Source: Multi-Document Summarization using Distributed Bag-of-Words Model
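
To make the extractive idea concrete, here is a minimal sketch of greedy sentence selection that rewards relevance to the whole document set and penalizes overlap with sentences already chosen. It is an illustration only, not the method of any paper listed on this page. The function names, the bag-of-words cosine similarity, the naive sentence splitter, and the `redundancy_weight` parameter are all assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a string and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def extractive_summary(documents, max_sentences=2, redundancy_weight=0.7):
    """Greedily pick sentences that are relevant to the document set
    and dissimilar to the sentences already selected (hypothetical helper)."""
    # Naive split after ., ! or ? (an assumption; real systems use a proper splitter).
    sentences = [
        s.strip()
        for doc in documents
        for s in re.split(r"(?<=[.!?])\s+", doc)
        if s.strip()
    ]
    vectors = [Counter(tokenize(s)) for s in sentences]

    # The centroid is the summed word counts of all sentences in all documents.
    centroid = Counter()
    for vec in vectors:
        centroid.update(vec)

    selected = []
    while len(selected) < min(max_sentences, len(sentences)):
        best_index, best_score = None, -math.inf
        for i, vec in enumerate(vectors):
            if i in selected:
                continue
            relevance = cosine(vec, centroid)
            # Redundancy is the highest similarity to any sentence chosen so far.
            redundancy = max(
                (cosine(vec, vectors[j]) for j in selected), default=0.0
            )
            score = relevance - redundancy_weight * redundancy
            if score > best_score:
                best_index, best_score = i, score
        selected.append(best_index)

    # Return the chosen sentences in their original order.
    return [sentences[i] for i in sorted(selected)]
```

Calling `extractive_summary(list_of_document_strings, max_sentences=3)` returns a list of sentences copied verbatim from the inputs. An abstractive system would instead generate new text, which this sketch does not attempt.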

Benchmarks

These leaderboards are used to track progress in Multi-Document Summarization.

Dataset	Best Model
Multi-News	PRIMER
DUC 2004	GCN: Personalized Discourse Graph
review	solar
WCEP	PRIMER
MS^2	led-base-16384-ms2

Most implemented papers

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

allenai/primer • ACL 2022

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.

Proposition-Level Clustering for Multi-Document Summarization

oriern/procluster • NAACL 2022

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

Clustering Sentences with Density Peaks for Multi-document Summarization

pvgladkov/density-peaks-sentence-clustering • HLT 2015

AllSummarizer system at MultiLing 2015: Multilingual single and multi-document summarization

kariminf/AllSummarizer • WS 2015

MDSWriter: Annotation Tool for Creating High-Quality Multi-Document Summarization Corpora

UKPLab/mdswriter • ACL 2016

A General Optimization Framework for Multi-Document Summarization Using Genetic Algorithms and Swarm Intelligence

UKPLab/coling2016-genetic-swarm-MDS • COLING 2016

Extracting summaries via integer linear programming and submodularity are popular and successful techniques in extractive multi-document summarization.

Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources

AIPHES/DBS • COLING 2016

Coherent extracts are a novel type of summary combining the advantages of manually created abstractive summaries, which are fluent but difficult to evaluate, and low-quality automatically created extractive summaries, which lack coherence and structure.

The Next Step for Multi-Document Summarization: A Heterogeneous Multi-Genre Corpus Built with a Novel Construction Approach

AIPHES/hMDS • COLING 2016

In a detailed analysis, we show that our new corpus is significantly different from the homogeneous corpora commonly used, and that it is heterogeneous along several dimensions.

Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps

UKPLab/emnlp2017-cmapsum-corpus • EMNLP 2017

Concept maps can be used to concisely represent important information and bring structure into large document collections.

Beyond Generic Summarization: A Multi-faceted Hierarchical Summarization Corpus of Large Heterogeneous Data

AIPHES/HierarchicalSummarization • LREC 2018
