Key Information Extraction

27 papers with code • 6 benchmarks • 10 datasets

Key Information Extraction (KIE) aims to extract structured information (e.g. key-value pairs) from form-style documents (e.g. invoices), an important step towards intelligent document understanding.
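As a rough illustration of what "key-value pairs" can look like in practice, the sketch below assumes that OCR tokens have already been labelled with BIO tags by some token classifier, and groups them into fields. The tag names (COMPANY, TOTAL, DATE), the sample tokens, and the helper name bio_to_fields are hypothetical and not taken from any dataset or library listed on this page.

```python
from typing import Dict, List, Optional


def bio_to_fields(tokens: List[str], tags: List[str]) -> Dict[str, List[str]]:
    """Group BIO-tagged tokens into {field_name: [value, ...]}.

    Assumes tags look like "B-KEY", "I-KEY" or "O" (hypothetical scheme).
    An "I-" tag that does not continue the current field is dropped.
    """
    fields: Dict[str, List[str]] = {}
    current_key: Optional[str] = None
    current_words: List[str] = []

    def flush() -> None:
        # Store the field collected so far, if any, and reset the state.
        nonlocal current_key, current_words
        if current_key is not None:
            fields.setdefault(current_key, []).append(" ".join(current_words))
        current_key, current_words = None, []

    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):
            flush()
            current_key, current_words = tag[2:], [token]
        elif tag.startswith("I-") and current_key == tag[2:]:
            current_words.append(token)
        else:
            flush()
    flush()
    return fields


# Hypothetical receipt tokens and predicted tags.
tokens = ["ACME", "Store", "Total", ":", "12.50", "Date", "2021-03-26"]
tags = ["B-COMPANY", "I-COMPANY", "O", "O", "B-TOTAL", "O", "B-DATE"]
result = bio_to_fields(tokens, tags)
print(result)
```

Values are kept in lists because a field such as a line item may occur more than once in a document.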

Benchmarks

These leaderboards are used to track progress in Key Information Extraction.

Dataset	Best Model
CORD	GeoLayoutLM
Kleister NDA	LayoutLMv2-LARGE
SROIE	LayoutLMv2-LARGE (Excluding OCR mismatch)
EPHOIE	LayoutLMv3
ETD500	CRF-visual
SIMARA	DAN

Libraries

Use these libraries to find Key Information Extraction models and implementations.

Repository	Papers	GitHub stars
PaddlePaddle/PaddleOCR	5	38,458
huggingface/transformers	4	124,889
microsoft/unilm	2	18,315
open-mmlab/mmocr	2	4,068

See all 6 libraries.

Most implemented papers

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

microsoft/unilm • 31 Dec 2019

In this paper, we propose the LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents.
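The "layout information" mentioned here is usually supplied as one bounding box per token alongside the text. A minimal sketch, assuming pixel boxes in (x0, y0, x1, y1) order and the 0-1000 coordinate scale expected by the Hugging Face transformers LayoutLM implementation; the function name normalize_box, the page size, and the sample words are illustrative assumptions, not part of the paper.

```python
from typing import List, Sequence


def normalize_box(box: Sequence[float], page_width: float, page_height: float,
                  scale: int = 1000) -> List[int]:
    """Rescale a pixel box (x0, y0, x1, y1) to integer coordinates in [0, scale]."""
    x0, y0, x1, y1 = box
    return [
        int(scale * x0 / page_width),
        int(scale * y0 / page_height),
        int(scale * x1 / page_width),
        int(scale * y1 / page_height),
    ]


# Hypothetical OCR output for an 800 x 1000 pixel page.
words = ["Total", "12.50"]
pixel_boxes = [(100, 700, 180, 720), (400, 700, 460, 720)]
page_width, page_height = 800, 1000

# One normalized box per word; these would accompany the token ids as model input.
layout_inputs = [normalize_box(b, page_width, page_height) for b in pixel_boxes]
for word, box in zip(words, layout_inputs):
    print(word, box)
```

Normalizing by page size makes the coordinates comparable across scans of different resolutions, which is why the boxes are rescaled rather than passed in as raw pixels.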

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

microsoft/unilm • ACL 2021

Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents.

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

wenwenyu/PICK-pytorch • 16 Apr 2020

Computer vision with state-of-the-art deep learning models has achieved huge success in the field of Optical Character Recognition (OCR) including text detection and recognition tasks recently.

Spatial Dual-Modality Graph Reasoning for Key Information Extraction

open-mmlab/mmocr • 26 Mar 2021

In order to roundly evaluate our proposed method as well as boost the future research, we release a new dataset named WildReceipt, which is collected and annotated tailored for the evaluation of key information extraction from document images of unseen templates in the wild.

Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations

lamps-lab/AutoMeta • 1 Jul 2021

Our experiments show that CRF with visual features outperformed both a heuristic and a CRF model with only text-based features.

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

open-mmlab/mmocr • 14 Aug 2021

We present MMOCR, an open-source toolbox which provides a comprehensive pipeline for text detection and recognition, as well as their downstream tasks such as named entity recognition and key information extraction.

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

jpwang/lilt • ACL 2022

LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models.
