Multi-Document Summarization

93 papers with code • 5 benchmarks • 15 datasets

Multi-Document Summarization is the task of representing a set of documents with a short piece of text that captures the relevant information and filters out the redundant information. Two prominent approaches are extractive and abstractive summarization: extractive systems aim to extract salient snippets, sentences, or passages from the documents, while abstractive systems aim to concisely paraphrase their content.

Source: Multi-Document Summarization using Distributed Bag-of-Words Model
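As a concrete point of reference for the extractive approach, here is a minimal, self-contained sketch (not any particular paper's method): sentences from all documents are scored by TF-IDF centrality, and the top-k are returned in document order.

```python
# Minimal extractive multi-document summarization sketch: score sentences
# by TF-IDF centrality across the whole document set and keep the top-k.
# Illustrative only; not any specific paper's method.
import re
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def extractive_summary(documents, k=3):
    # Naive sentence splitting; a real system would use a proper tokenizer.
    sentences = [s.strip() for doc in documents
                 for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
    tfidf = TfidfVectorizer(stop_words="english").fit_transform(sentences)
    # A sentence is salient if it is similar to many other sentences.
    centrality = cosine_similarity(tfidf).mean(axis=1)
    top = np.argsort(centrality)[::-1][:k]
    return [sentences[i] for i in sorted(top)]  # keep original order

docs = [
    "The storm hit the coast on Monday. Thousands lost power.",
    "Power outages affected thousands after Monday's storm. Repairs began Tuesday.",
]
print(extractive_summary(docs, k=2))
```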

PeerSum: A Peer Review Dataset for Abstractive Multi-document Summarization

oaimli/peersum 3 Mar 2022

We present PeerSum, a new MDS dataset using peer reviews of scientific publications.

12 stars • 03 Mar 2022

Proposition-Level Clustering for Multi-Document Summarization

oriern/procluster ACL ARR January 2022

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

10 stars • 16 Jan 2022
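The clustering-for-redundancy idea can be illustrated with a much simpler stand-in than ProCluster's proposition-level pipeline: embed sentences with TF-IDF, cluster them, and keep one representative per cluster. The threshold and the representative choice below are illustrative assumptions.

```python
# Simplified sketch of clustering-for-redundancy in MDS: group similar
# sentences, then keep one representative per cluster. ProCluster clusters
# propositions with learned models; this toy version uses TF-IDF features
# and agglomerative clustering purely for illustration.
import numpy as np
from sklearn.cluster import AgglomerativeClustering
from sklearn.feature_extraction.text import TfidfVectorizer

sentences = [
    "The merger was approved by regulators on Friday.",
    "Regulators signed off on the merger Friday.",
    "The deal is expected to close next quarter.",
]
X = TfidfVectorizer(stop_words="english").fit_transform(sentences).toarray()
# Note: metric= replaced affinity= in scikit-learn >= 1.2.
clust = AgglomerativeClustering(n_clusters=None, distance_threshold=0.7,
                                metric="cosine", linkage="average").fit(X)
for label in np.unique(clust.labels_):
    members = [i for i, l in enumerate(clust.labels_) if l == label]
    # Keep the longest member as the cluster representative.
    rep = max(members, key=lambda i: len(sentences[i]))
    print(f"cluster {label}: {sentences[rep]}")
```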

Proposition-Level Clustering for Multi-Document Summarization

oriern/clusterprop NAACL 2022

Text clustering methods were traditionally incorporated into multi-document summarization (MDS) as a means for coping with considerable information repetition.

10 stars • 16 Dec 2021

LongT5: Efficient Text-To-Text Transformer for Long Sequences

google-research/longt5 Findings (NAACL) 2022

Recent work has shown that either (1) increasing the input length or (2) increasing model size can improve the performance of Transformer-based neural models.

169 stars • 15 Dec 2021
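The LongT5 entry lends itself to a short usage sketch with Hugging Face Transformers. The checkpoint name google/long-t5-tglobal-base and the document-separator convention are assumptions here, and an un-fine-tuned checkpoint will not produce good summaries; treat this as a sketch, not the authors' recipe.

```python
# Hedged sketch: summarize a concatenated document set with LongT5 via
# Hugging Face Transformers. Checkpoint name and separator are assumptions.
from transformers import AutoTokenizer, LongT5ForConditionalGeneration

model_name = "google/long-t5-tglobal-base"  # assumed checkpoint
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LongT5ForConditionalGeneration.from_pretrained(model_name)

docs = ["First article text ...", "Second article text ..."]
# A common MDS convention is to concatenate documents with a separator token.
inputs = tokenizer(" </s> ".join(docs), return_tensors="pt",
                   truncation=True, max_length=4096)
summary_ids = model.generate(**inputs, max_new_tokens=128, num_beams=4)
print(tokenizer.decode(summary_ids[0], skip_special_tokens=True))
```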

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

allenai/primer ACL ARR November 2021

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.

148 stars • 16 Nov 2021
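PRIMERA is built on the Longformer Encoder-Decoder (LED). The sketch below shows one plausible way to run it with Hugging Face Transformers; the hub id allenai/PRIMERA and the <doc-sep> separator follow the authors' repo but should be treated as assumptions, and details such as global attention on separator tokens are omitted (check the repo for exact preprocessing).

```python
# Hedged sketch of running PRIMERA (LED-based) for multi-document
# summarization. Hub id and <doc-sep> separator are assumptions.
from transformers import AutoTokenizer, LEDForConditionalGeneration

model_name = "allenai/PRIMERA"  # assumed hub id
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = LEDForConditionalGeneration.from_pretrained(model_name)

docs = ["Document one text ...", "Document two text ..."]
text = " <doc-sep> ".join(docs)  # documents joined with a special separator
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=4096)
ids = model.generate(**inputs, max_new_tokens=256, num_beams=5)
print(tokenizer.decode(ids[0], skip_special_tokens=True))
```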

SgSum: Transforming Multi-document Summarization into Sub-graph Selection

PaddlePaddle/Research 25 Oct 2021

Compared with traditional methods, our method has two main advantages: (1) the relations between sentences are captured by modeling both the graph structure of the whole document set and the candidate sub-graphs; (2) it directly outputs an integrated summary in the form of a sub-graph, which is more informative and coherent.

1,694 stars • 25 Oct 2021
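As a drastically simplified, non-neural stand-in for the "summary as sub-graph" view, the sketch below builds a sentence similarity graph over the document set and greedily grows a selection that trades off centrality against overlap with already-selected sentences. It only illustrates the idea, not SgSum itself.

```python
# Toy sub-graph-style selection: favor central sentences while penalizing
# overlap with sentences already in the selected sub-graph.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def select_subgraph(sentences, k=3, redundancy_penalty=0.6):
    sim = cosine_similarity(TfidfVectorizer().fit_transform(sentences))
    centrality = sim.mean(axis=1)
    selected = []
    for _ in range(min(k, len(sentences))):
        def score(i):
            overlap = max((sim[i][j] for j in selected), default=0.0)
            return centrality[i] - redundancy_penalty * overlap
        best = max((i for i in range(len(sentences)) if i not in selected),
                   key=score)
        selected.append(best)
    return [sentences[i] for i in sorted(selected)]
```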

PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

allenai/primer ACL 2022

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of fine-tuning labeled data.

148 stars • 16 Oct 2021

Modeling Endorsement for Multi-Document Abstractive Summarization

ucfnlp/endorser-summ EMNLP (newsum) 2021

In this paper, we model the cross-document endorsement effect and its utilization in multiple document summarization.

1 star • 15 Oct 2021
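A toy way to picture cross-document endorsement: a sentence counts as endorsed when the other documents contain similar content. The helper below (hypothetical name endorsement_scores, TF-IDF similarity) is illustrative only; the paper integrates endorsement into an abstractive model rather than scoring sentences this way.

```python
# Toy cross-document "endorsement" score: for each sentence, average its
# best similarity to sentences from each *other* document.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def endorsement_scores(docs_sentences):
    # docs_sentences: list of documents, each a list of sentences
    flat = [s for doc in docs_sentences for s in doc]
    doc_of = [d for d, doc in enumerate(docs_sentences) for _ in doc]
    sim = cosine_similarity(TfidfVectorizer().fit_transform(flat))
    scores = []
    for i in range(len(flat)):
        others = [max(sim[i][j] for j in range(len(flat)) if doc_of[j] == d)
                  for d in set(doc_of) if d != doc_of[i]]
        scores.append(float(np.mean(others)) if others else 0.0)
    return list(zip(flat, scores))
```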

HETFORMER: Heterogeneous Transformer with Sparse Attention for Long-Text Extractive Summarization

yeliu918/hetformer EMNLP 2021

To capture the semantic graph structure from raw text, most existing summarization approaches are built on GNNs with a pre-trained model.

14 stars • 12 Oct 2021
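The kind of sparse attention such long-text models rely on can be pictured as a mask combining a local sliding window with a few global positions (e.g., sentence or document boundary tokens). HETFORMER's heterogeneous pattern is more involved, so treat this as a generic sketch rather than the paper's attention scheme.

```python
# Generic sparse attention mask: local window plus global tokens.
import numpy as np

def sparse_attention_mask(seq_len, window=2, global_positions=(0,)):
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True          # local sliding window
    for g in global_positions:
        mask[:, g] = True              # everyone attends to global tokens
        mask[g, :] = True              # global tokens attend to everyone
    return mask

print(sparse_attention_mask(8, window=1, global_positions=(0,)).astype(int))
```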

Extending Multi-Text Sentence Fusion Resources via Pyramid Annotations

danielabweiss/extending-sentence-fusion-resources NAACL 2022

NLP models that compare or consolidate information across multiple documents often struggle when challenged with recognizing substantial information redundancies across the texts.

2 stars • 09 Oct 2021