TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Named Entity Recognition (NER)	CORD-r	TPP (LayoutMask)	F1	89.34	# 2
Named Entity Recognition (NER)	CORD-r	TPP (LayoutLMv3)	F1	91.85	# 1
Relation Extraction	FUNSD	TPP (LayoutMask)	F1	79.20	# 4
Entity Linking	FUNSD	TPP (LayoutMask)	F1	79.20	# 1
Named Entity Recognition (NER)	FUNSD-r	TPP (LayoutLMv3)	F1	80.40	# 1
Named Entity Recognition (NER)	FUNSD-r	TPP (LayoutMask)	F1	78.19	# 3
Reading Order Detection	ReadingBank	TPP (LayoutMask)	Average Page-level BLEU	98.16	# 2
Reading Order Detection	ReadingBank	TPP (LayoutMask)	Average Relative Distance (ARD)	0.37	# 1

Reading Order Matters: Information Extraction from Visually-rich Documents by Token Path Prediction

17 Oct 2023 · Chong Zhang, Ya Guo, Yi Tu, Huan Chen, Jinyang Tang, Huijia Zhu, Qi Zhang, Tao Gui

Recent advances in multimodal pre-trained models have significantly improved information extraction from visually-rich documents (VrDs), in which named entity recognition (NER) is treated as a sequence-labeling task of predicting the BIO entity tags for tokens, following the typical setting of NLP. However, the BIO-tagging scheme relies on the correct order of model inputs, which is not guaranteed in real-world NER on scanned VrDs, where text is recognized and arranged by OCR systems. This reading-order issue hinders the accurate marking of entities by the BIO-tagging scheme, making it impossible for sequence-labeling methods to predict correct named entities. To address the reading-order issue, we introduce Token Path Prediction (TPP), a simple prediction head that predicts entity mentions as token sequences within documents. As an alternative to token classification, TPP models the document layout as a complete directed graph of tokens and predicts token paths within the graph as entities. For better evaluation of VrD-NER systems, we also propose two revised benchmark datasets of NER on scanned documents that reflect real-world scenarios. Experimental results demonstrate the effectiveness of our method and suggest its potential to be a universal solution to various information extraction tasks on documents.
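
The contrast the abstract draws between BIO tagging and token paths can be illustrated with a toy example. The sketch below is not the authors' implementation: the function names, the `links`/`starts` representation, and the four-token document are all hypothetical, and a real TPP head would predict links with a model rather than receive them by hand. It assumes an OCR system that reads a two-column form row by row, so the tokens of each entity are interleaved in the input order.

```python
from typing import Dict, List, Tuple

Entity = Tuple[str, str]  # (entity type, entity text)


def decode_bio(tokens: List[str], tags: List[str]) -> List[Entity]:
    """Decode entities from BIO tags, reading tokens in their input order."""
    entities: List[Entity] = []
    current: List[str] = []
    current_type = None
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            if current:
                entities.append((current_type, " ".join(current)))
            current, current_type = [token], tag[2:]
        elif tag.startswith("I-") and current and tag[2:] == current_type:
            current.append(token)
        else:
            # "O" tag, or an I- tag that does not continue the open entity.
            if current:
                entities.append((current_type, " ".join(current)))
            current, current_type = [], None
    if current:
        entities.append((current_type, " ".join(current)))
    return entities


def decode_token_paths(
    tokens: List[str], links: Dict[int, int], starts: Dict[int, str]
) -> List[Entity]:
    """Follow directed token-to-token links from each start token.

    links[i] = j means token j follows token i inside an entity mention.
    starts maps the first token index of each mention to its entity type.
    Both structures are hypothetical stand-ins for model predictions.
    """
    entities: List[Entity] = []
    for start, entity_type in sorted(starts.items()):
        path, seen = [start], {start}
        while path[-1] in links and links[path[-1]] not in seen:
            nxt = links[path[-1]]
            path.append(nxt)
            seen.add(nxt)
        entities.append((entity_type, " ".join(tokens[i] for i in path)))
    return entities


# OCR order interleaves the left column ("John Smith") and right column ("17 Oct").
tokens = ["John", "17", "Smith", "Oct"]

# Best-effort BIO tags over the disordered input: each entity gets cut short.
bio_entities = decode_bio(tokens, ["B-PER", "B-DATE", "I-PER", "I-DATE"])
print(bio_entities)   # [('PER', 'John'), ('DATE', '17')]

# Token paths do not depend on input order: John -> Smith, 17 -> Oct.
path_entities = decode_token_paths(
    tokens, links={0: 2, 1: 3}, starts={0: "PER", 1: "DATE"}
)
print(path_entities)  # [('PER', 'John Smith'), ('DATE', '17 Oct')]
```

In this toy case no BIO tag sequence over the disordered input can recover both full entities, because a BIO span must be contiguous, whereas a path over token indices can skip over interleaved tokens.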

Code

chongzhangfdu/tpp official

Tasks

Entity Linking

Key Information Extraction

named-entity-recognition

Named Entity Recognition

Named Entity Recognition (NER)

NER

Reading Order Detection

Relation Extraction

Semantic entity labeling

Sentence Ordering

token-classification

Token Classification

Datasets

FUNSD

CORD

ReadingBank

FUNSD-r

CORD-r

Results from the Paper

Ranked #1 on Entity Linking on FUNSD

