PRIMERA: Pyramid-based Masked Sentence Pre-training for Multi-document Summarization

We introduce PRIMERA, a pre-trained model for multi-document representation with a focus on summarization that reduces the need for dataset-specific architectures and large amounts of labeled fine-tuning data. PRIMERA uses our newly proposed pre-training objective, designed to teach the model to connect and aggregate information across documents. It also uses efficient encoder-decoder transformers to simplify the processing of concatenated input documents. In extensive experiments on 6 multi-document summarization datasets from 3 different domains in zero-shot, few-shot, and fully supervised settings, PRIMERA outperforms current state-of-the-art dataset-specific and pre-trained models in most of these settings by large margins. The code and pre-trained models can be found at \url{}.

ACL 2022
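| Task                         | Dataset                     | Model  | Metric Name | Metric Value | Global Rank |
|------------------------------|-----------------------------|--------|-------------|--------------|-------------|
| Text Summarization           | arXiv Summarization Dataset | PRIMER | ROUGE-1     | 47.6         | # 1         |
|                              |                             |        | ROUGE-2     | 20.8         | # 1         |
|                              |                             |        | ROUGE-L     | 42.6         | # 1         |
| Multi-Document Summarization | Multi-News                  | PRIMER | ROUGE-1     | 49.9         | # 1         |
|                              |                             |        | ROUGE-2     | 21.1         | # 1         |
|                              |                             |        | ROUGE-L     | 25.9         | # 1         |
| Multi-Document Summarization | WCEP                        | PRIMER | ROUGE-1     | 46.1         | # 1         |
|                              |                             |        | ROUGE-2     | 25.2         | # 1         |
|                              |                             |        | ROUGE-L     | 37.9         | # 1         |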
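
The ROUGE-1 and ROUGE-2 figures reported above are n-gram overlap scores between a generated summary and a reference summary. The following is a minimal sketch of that idea, assuming lowercased whitespace tokenization and no stemming. The function names `ngrams` and `rouge_n` are illustrative, and this is not the official scorer used to produce the leaderboard values.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Illustrative ROUGE-N precision, recall, and F1 (assumed tokenization)."""
    cand = ngrams(candidate.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram,
    # so a repeated n-gram is only credited as often as it appears in both.
    overlap = sum((cand & ref).values())
    precision = overlap / max(sum(cand.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return precision, recall, f1


if __name__ == "__main__":
    summary = "the cat sat on the mat"
    gold = "the cat lay on the mat"
    print(rouge_n(summary, gold, n=1))
    print(rouge_n(summary, gold, n=2))
```

The table values appear to be on a 0 to 100 scale, so a score from this sketch would be multiplied by 100 before comparison. ROUGE-L is based on the longest common subsequence rather than fixed-length n-grams and is not covered here.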