TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Key Information Extraction	CORD	LILT	F1	96.07	# 5
Semantic entity labeling	FUNSD	LILT	F1	88.41	# 9
Document Image Classification	RVL-CDIP	LiLT[EN-R]BASE	Accuracy	95.68%	# 5

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/lilt-a-simple-yet-effective-language/key-information-extraction-on-cord)](https://paperswithcode.com/sota/key-information-extraction-on-cord?p=lilt-a-simple-yet-effective-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/lilt-a-simple-yet-effective-language/document-image-classification-on-rvl-cdip)](https://paperswithcode.com/sota/document-image-classification-on-rvl-cdip?p=lilt-a-simple-yet-effective-language)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/lilt-a-simple-yet-effective-language/semantic-entity-labeling-on-funsd)](https://paperswithcode.com/sota/semantic-entity-labeling-on-funsd?p=lilt-a-simple-yet-effective-language)`

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

ACL 2022 · Jiapeng Wang, Lianwen Jin, Kai Ding

Structured document understanding has attracted considerable attention and made significant progress recently, owing to its crucial role in intelligent document processing. However, most existing related models can only deal with the document data of specific language(s) (typically English) included in the pre-training collection, which is extremely limited. To address this issue, we propose a simple yet effective Language-independent Layout Transformer (LiLT) for structured document understanding. LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models. Experimental results on eight languages have shown that LiLT can achieve competitive or even superior performance on diverse widely-used downstream benchmarks, which enables language-independent benefit from the pre-training of document layout structure. Code and model are publicly available at https://github.com/jpWang/LiLT.

Code

jpwang/lilt (official) · 318 stars

huggingface/transformers · 124,889 stars

Tasks

Document Image Classification

document understanding

Key Information Extraction

Semantic entity labeling

Datasets

FUNSD

RVL-CDIP

CORD

XFUND

EPHOIE

Results from the Paper

Ranked #5 on Key Information Extraction on CORD

Task	Dataset	Model	Metric Name	Metric Value	Global Rank
Key Information Extraction	CORD	LILT	F1	96.07	# 5
Semantic entity labeling	FUNSD	LILT	F1	88.41	# 9
Document Image Classification	RVL-CDIP	LiLT[EN-R]BASE	Accuracy	95.68%	# 5
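
The F1 values reported for CORD and FUNSD are on a 0-100 scale. As a rough illustration of how such a score can be computed, here is a minimal sketch of micro-averaged F1 over entities. It assumes that an entity counts as correct only when its label and token span match exactly. The `entity_f1` helper, the `Entity` tuple layout, and the toy data are hypothetical and are not taken from the paper or from either benchmark's official scorer, which may apply different matching rules.

```python
from typing import Iterable, Set, Tuple

# An entity is (label, start_token, end_token). Exact matching on all three
# fields is an assumption, not necessarily the benchmarks' scoring rule.
Entity = Tuple[str, int, int]


def entity_f1(gold: Iterable[Entity], pred: Iterable[Entity]) -> float:
    """Return micro-averaged F1 (0.0 to 1.0) over exactly matching entities."""
    gold_set: Set[Entity] = set(gold)
    pred_set: Set[Entity] = set(pred)
    true_pos = len(gold_set & pred_set)
    if true_pos == 0:
        # Covers empty inputs and the no-overlap case without dividing by zero.
        return 0.0
    precision = true_pos / len(pred_set)
    recall = true_pos / len(gold_set)
    return 2 * precision * recall / (precision + recall)


# Hypothetical toy data: 3 gold entities, 3 predictions, 2 exact matches.
gold = [("HEADER", 0, 1), ("QUESTION", 2, 4), ("ANSWER", 5, 8)]
pred = [("HEADER", 0, 1), ("QUESTION", 2, 4), ("ANSWER", 5, 7)]
score = entity_f1(gold, pred)
print(f"F1 = {100 * score:.2f}")  # prints "F1 = 66.67"
```

In the toy data the third prediction ends one token early, so it counts as both a false positive and a false negative. Precision and recall are each 2/3, which gives the F1 of 66.67 shown above.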

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer
