Multi-Document Summarization

93 papers with code • 5 benchmarks • 15 datasets

Multi-Document Summarization is a process of representing a set of documents with a short piece of text by capturing the relevant information and filtering out the redundant information. Two prominent approaches to Multi-Document Summarization are extractive and abstractive summarization. Extractive summarization systems aim to extract salient snippets, sentences or passages from documents, while abstractive summarization systems aim to concisely paraphrase the content of the documents.

Source: Multi-Document Summarization using Distributed Bag-of-Words Model
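To make the extractive idea above concrete, here is a minimal sketch in Python. It is not the method of any paper listed on this page. It assumes that word frequency across the document set is a rough proxy for salience and that Jaccard word overlap is a rough proxy for redundancy. The function names (`summarize`, `split_sentences`, `jaccard`) and the `redundancy_threshold` parameter are hypothetical and chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def jaccard(tokens_a, tokens_b):
    """Word-set overlap between two sentences, in the range [0, 1]."""
    a, b = set(tokens_a), set(tokens_b)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


def summarize(documents, max_sentences=2, redundancy_threshold=0.5):
    """Greedy extractive summary over a set of documents (illustrative only).

    Assumption: a sentence is salient if its words are frequent across the
    whole document set, and redundant if it overlaps heavily with a sentence
    that has already been selected.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    tokens = [tokenize(s) for s in sentences]
    frequency = Counter(t for sentence_tokens in tokens for t in sentence_tokens)

    def salience(sentence_tokens):
        if not sentence_tokens:
            return 0.0
        return sum(frequency[t] for t in sentence_tokens) / len(sentence_tokens)

    # Rank sentence indices from most to least salient.
    ranked = sorted(range(len(sentences)), key=lambda i: salience(tokens[i]), reverse=True)

    chosen = []
    for i in ranked:
        if len(chosen) == max_sentences:
            break
        # Keep a sentence only if it is not too similar to one already chosen.
        if all(jaccard(tokens[i], tokens[j]) < redundancy_threshold for j in chosen):
            chosen.append(i)

    # Return the selected sentences in their original reading order.
    return [sentences[i] for i in sorted(chosen)]


if __name__ == "__main__":
    docs = [
        "The storm hit the coast on Monday. Schools were closed.",
        "The storm hit the coast on Monday. Power was restored by Friday.",
    ]
    for sentence in summarize(docs):
        print(sentence)
```

The sentence that appears in both documents is selected once and its duplicate is filtered out, which is the "capture the relevant, drop the redundant" behavior described above. An abstractive system would instead generate new wording, typically with a trained sequence-to-sequence model, which is beyond what a short sketch can show.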

Most implemented papers

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

allenai/primer • ACL 2022

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.

Proposition-Level Clustering for Multi-Document Summarization

oriern/procluster • NAACL 2022

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

A General Optimization Framework for Multi-Document Summarization Using Genetic Algorithms and Swarm Intelligence

UKPLab/coling2016-genetic-swarm-MDS • COLING 2016

Extracting summaries via integer linear programming and submodularity are popular and successful techniques in extractive multi-document summarization.

Bridging the gap between extractive and abstractive summaries: Creation and evaluation of coherent extracts from heterogeneous sources

AIPHES/DBS • COLING 2016

Coherent extracts are a novel type of summary combining the advantages of manually created abstractive summaries, which are fluent but difficult to evaluate, and low-quality automatically created extractive summaries, which lack coherence and structure.

The Next Step for Multi-Document Summarization: A Heterogeneous Multi-Genre Corpus Built with a Novel Construction Approach

AIPHES/hMDS • COLING 2016

In a detailed analysis, we show that our new corpus is significantly different from the homogeneous corpora commonly used, and that it is heterogeneous along several dimensions.

Bringing Structure into Summaries: Crowdsourcing a Benchmark Corpus of Concept Maps

UKPLab/emnlp2017-cmapsum-corpus • EMNLP 2017

Concept maps can be used to concisely represent important information and bring structure into large document collections.