Multi-Document Summarization

93 papers with code • 5 benchmarks • 15 datasets

Multi-Document Summarization is the task of representing a set of documents with a short piece of text that captures the relevant information and filters out redundant information. Two prominent approaches to Multi-Document Summarization are extractive and abstractive summarization. Extractive summarization systems aim to extract salient snippets, sentences or passages from the documents, while abstractive summarization systems aim to concisely paraphrase their content.

Source: Multi-Document Summarization using Distributed Bag-of-Words Model
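
The extractive approach described above can be illustrated with a minimal sketch. It is not the method of the cited paper or of any model listed on this page: it assumes a simple word-frequency salience score and a Jaccard-overlap redundancy filter, and the names `extractive_summary`, `max_sentences` and `max_overlap` are hypothetical, chosen only for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and return its alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def split_sentences(document):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", document) if s.strip()]


def jaccard(a, b):
    """Jaccard similarity between two token sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def extractive_summary(documents, max_sentences=2, max_overlap=0.5):
    """Pick salient, mutually non-redundant sentences from a set of documents.

    Assumption: salience is the mean corpus frequency of a sentence's tokens,
    and a candidate is redundant if its Jaccard overlap with an already
    selected sentence exceeds max_overlap.
    """
    sentences = [s for doc in documents for s in split_sentences(doc)]
    # Token counts over the whole document set serve as the salience signal.
    freq = Counter(tok for s in sentences for tok in tokenize(s))

    def score(sentence):
        toks = tokenize(sentence)
        return sum(freq[t] for t in toks) / len(toks) if toks else 0.0

    selected = []
    for candidate in sorted(sentences, key=score, reverse=True):
        cand_tokens = set(tokenize(candidate))
        if not cand_tokens:
            continue
        # Filter out sentences that repeat information already in the summary.
        if any(jaccard(cand_tokens, set(tokenize(s))) > max_overlap for s in selected):
            continue
        selected.append(candidate)
        if len(selected) == max_sentences:
            break
    return selected


docs = [
    "The storm hit the coast on Monday. Thousands lost power.",
    "The storm hit the coast on Monday. Schools were closed.",
]
summary = extractive_summary(docs)
print(summary)
```

The sentence shared by both documents is selected once, and its duplicate is dropped by the redundancy filter. An abstractive system would instead generate new text, typically with a sequence-to-sequence model, rather than copy source sentences.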

Benchmarks

These leaderboards are used to track progress in Multi-Document Summarization.

Dataset	Best Model
Multi-News	PRIMER
DUC 2004	GCN: Personalized Discourse Graph
review	solar
WCEP	PRIMER
MS^2	led-base-16384-ms2

Latest papers

XWikiGen: Cross-lingual Summarization for Encyclopedic Text Generation in Low Resource Languages

DhavalTaunk08/XWikiGen • 22 Mar 2023

But, for low-resource languages, the scarcity of reference articles makes monolingual summarization ineffective in solving this problem.

Compressed Heterogeneous Graph for Abstractive Multi-Document Summarization

oaimli/hgsum • 12 Mar 2023

We propose HGSUM, an MDS model that extends an encoder-decoder architecture to incorporate a heterogeneous graph to represent different semantic units (e.g., words and sentences) of the documents.

PDSum: Prototype-driven Continuous Summarization of Evolving Multi-document Sets Stream

cliveyn/pdsum • 10 Feb 2023

Summarizing text-rich documents has long been studied in the literature, but most existing efforts have focused on summarizing a static and predefined multi-document set.

Generating a Structured Summary of Numerous Academic Papers: Dataset and Method

stevenlau6/bigsurvey • 9 Feb 2023

Existing MDS datasets usually focus on producing a structureless summary covering a few input documents.

SumREN: Summarizing Reported Speech about Events in News

amazon-science/SumREN • 2 Dec 2022

A primary objective of news articles is to establish the factual record for an event, frequently achieved by conveying both the details of the specified event (i.e., the 5 Ws: Who, What, Where, When and Why regarding the event) and how people reacted to it (i.e., reported statements).

How "Multi" is Multi-Document Summarization?

ariecattan/multi_mds • 23 Oct 2022

To that end, we propose an automated measure for evaluating the degree to which a summary is "disperse", in the sense of the number of source documents needed to cover its content.

Multi-Document Summarization with Centroid-Based Pretraining

ratishsp/centrum • 1 Aug 2022

In Multi-Document Summarization (MDS), the input can be modeled as a set of documents, and the output is its summary.

Multi-LexSum: Real-World Summaries of Civil Rights Lawsuits at Multiple Granularities

multilexsum/dataset • 22 Jun 2022

With the advent of large language models, methods for abstractive summarization have made great strides, creating potential for use in applications to aid knowledge workers processing unwieldy document collections.

Improving Multi-Document Summarization through Referenced Flexible Extraction with Credit-Awareness

yunzhusong/NAACL2022-REFLECT • NAACL 2022

A notable challenge in Multi-Document Summarization (MDS) is the extreme length of the input.

04 May 2022

A Multi-Document Coverage Reward for RELAXed Multi-Document Summarization

jacob-parnell-rozetta/longformer_coverage • ACL 2022

Multi-document summarization (MDS) has made significant progress in recent years, in part facilitated by the availability of new, dedicated datasets and capacious language models.

06 Mar 2022
