On Extractive and Abstractive Neural Document Summarization with Transformer Language Models

We present a method to produce abstractive summaries of long documents that exceed several thousand words via neural abstractive summarization. We perform a simple extractive step before generating a summary, which is then used to condition the transformer language model on relevant information before being tasked with generating a summary. We show that this extractive step significantly improves summarization results. We also show that this approach produces more abstractive summaries compared to prior work that employs a copy mechanism while still achieving higher rouge scores. Note: The abstract above was not written by the authors; it was generated by one of the models presented in this paper.
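The extract-then-condition idea in the abstract can be sketched roughly as follows. This is a minimal illustration, not the paper's implementation: the sentence scorer is a hypothetical stand-in (mean word frequency) for a learned extractor, and the names `extract_sentences`, `build_conditioning_text`, and the `<summary>` separator are assumptions made for this example only.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter: break after ., ! or ? followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def extract_sentences(document, k=2):
    """Return the k highest-scoring sentences, kept in document order.

    The score is a hypothetical stand-in (mean document frequency of the
    sentence's words), not a learned extractive model.
    """
    sentences = split_sentences(document)
    freqs = Counter(re.findall(r"[a-z']+", document.lower()))

    def score(sentence):
        words = re.findall(r"[a-z']+", sentence.lower())
        return sum(freqs[w] for w in words) / len(words) if words else 0.0

    ranked = sorted(range(len(sentences)),
                    key=lambda i: score(sentences[i]), reverse=True)
    keep = sorted(ranked[:k])  # restore original ordering
    return [sentences[i] for i in keep]


def build_conditioning_text(document, k=2, separator="\n<summary>\n"):
    """Join the extracted sentences and append a separator marker.

    A language model would then be asked to continue this text,
    producing the abstractive summary (the model call is omitted here).
    """
    return " ".join(extract_sentences(document, k)) + separator
```

In this sketch the extractive step only shortens the input so that the conditioning text fits a language model's context; any sentence-ranking method could be swapped in for the stand-in scorer.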

Venue: EMNLP 2020

Results from the Paper


Task                Dataset                      Model     Metric Name  Metric Value  Global Rank
Text Summarization  Arxiv HEP-TH citation graph  Sent-PTR  ROUGE-1      42.32         #24
Text Summarization  Arxiv HEP-TH citation graph  Sent-CLF  ROUGE-1      34.01         #28
Text Summarization  Arxiv HEP-TH citation graph  TLM-I+E   ROUGE-1      42.43         #23
Text Summarization  Pubmed                       Sent-CLF  ROUGE-1      45.01         #18
Text Summarization  Pubmed                       Sent-PTR  ROUGE-1      43.3          #22
Text Summarization  Pubmed                       TLM-I+E   ROUGE-1      41.43         #24
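
For readers unfamiliar with the metric, ROUGE-1 measures unigram overlap between a generated summary and a reference summary. The sketch below is a simplified version that assumes lowercased whitespace tokenization with no stemming, so its numbers will not match an official ROUGE toolkit exactly. The function name `rouge1` is made up for this example, and the listing does not state whether the reported values are F1 or recall, so all three quantities are returned.

```python
from collections import Counter


def rouge1(candidate, reference):
    """Simplified ROUGE-1: clipped unigram overlap between two texts.

    Assumes whitespace tokenization and lowercasing only (no stemming,
    no stopword handling), unlike full ROUGE implementations.
    """
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    precision = overlap / sum(cand.values()) if cand else 0.0
    recall = overlap / sum(ref.values()) if ref else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


scores = rouge1("the cat sat on the mat", "the cat lay on the mat")
print(scores)
```

The function returns fractions between 0 and 1; the values in the table appear to be on a 0 to 100 scale, so a table entry such as 42.32 would correspond to roughly 0.4232 here.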

Methods