TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Text Summarization	Arxiv HEP-TH citation graph	HiStruct+	ROUGE-1	45.22	# 15
Text Summarization	Arxiv HEP-TH citation graph	HiStruct+	ROUGE-2	17.67	# 15
Text Summarization	Arxiv HEP-TH citation graph	HiStruct+	ROUGE-L	40.16	# 13
Text Summarization	Pubmed	HiStruct+	ROUGE-1	46.59	# 12
Text Summarization	Pubmed	HiStruct+	ROUGE-2	20.39	# 12
Text Summarization	Pubmed	HiStruct+	ROUGE-L	42.11	# 11

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/histruct-improving-extractive-text-1/text-summarization-on-pubmed-1)](https://paperswithcode.com/sota/text-summarization-on-pubmed-1?p=histruct-improving-extractive-text-1)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/histruct-improving-extractive-text-1/text-summarization-on-arxiv)](https://paperswithcode.com/sota/text-summarization-on-arxiv?p=histruct-improving-extractive-text-1)`

HiStruct+: Improving Extractive Text Summarization with Hierarchical Structure Information

Findings (ACL) 2022 · Qian Ruan, Malte Ostendorff, Georg Rehm

Transformer-based language models usually treat texts as linear sequences. However, most texts also have an inherent hierarchical structure, i.e., parts of a text can be identified using their position in this hierarchy. In addition, section titles usually indicate the common topic of their respective sentences. We propose a novel approach to formulate, extract, encode and inject hierarchical structure information explicitly into an extractive summarization model based on a pre-trained, encoder-only Transformer language model (HiStruct+ model), which substantially improves state-of-the-art (SOTA) ROUGE scores for extractive summarization on PubMed and arXiv. Using various experimental settings on three datasets (i.e., CNN/DailyMail, PubMed and arXiv), our HiStruct+ model collectively outperforms a strong baseline, which differs from our model only in that the hierarchical structure information is not injected. It is also observed that the more conspicuous the hierarchical structure of a dataset, the larger the improvement our method achieves. The ablation study demonstrates that the hierarchical position information is the main contributor to our model's SOTA performance.
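To make the idea of identifying a sentence by its position in the hierarchy concrete, here is a minimal sketch. It is not the authors' implementation: the function name `hierarchical_positions`, the two-level section/sentence layout, and the toy document are all assumptions chosen for illustration. It only shows how each sentence could be labeled with a section index and a within-section index alongside its ordinary linear index.

```python
from typing import List, Tuple

# Hypothetical helper (not from the paper): label every sentence with
# its linear index plus a two-level hierarchical position.
def hierarchical_positions(
    sections: List[List[str]],
) -> List[Tuple[int, int, int, str]]:
    """Return (linear_idx, section_idx, sentence_idx_in_section, sentence)."""
    positions = []
    linear_idx = 0
    for section_idx, sentences in enumerate(sections):
        for sentence_idx, sentence in enumerate(sentences):
            positions.append((linear_idx, section_idx, sentence_idx, sentence))
            linear_idx += 1
    return positions


if __name__ == "__main__":
    # Toy document: a list of sections, each a list of sentences (assumed layout).
    doc = [
        ["We study summarization.", "Structure matters."],
        ["We encode positions.", "We inject them.", "We evaluate."],
    ]
    for row in hierarchical_positions(doc):
        print(row)
```

A model that sees only the linear index cannot tell that the third sentence opens a new section; the pair (section index, sentence index) makes that explicit. How such positions are encoded and injected into the Transformer is described in the paper itself and is not reproduced here.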

Code

No code implementations yet.

Tasks

Extractive Summarization

Extractive Text Summarization

Language Modelling

Position

Text Summarization

Datasets

Pubmed • Arxiv HEP-TH citation graph

Results from the Paper

Ranked #12 on Text Summarization on Pubmed


Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
