Key Information Extraction

27 papers with code • 6 benchmarks • 10 datasets

Key Information Extraction (KIE) aims to extract structured information (e.g. key-value pairs) from form-style documents (e.g. invoices), an important step towards intelligent document understanding.
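
In a typical setup, the input is a set of OCR'd words with their bounding boxes, and the output is a field label per word from which key-value pairs are assembled. The following minimal sketch (invoice tokens, boxes, and BIO labels all invented for illustration) shows that input/output shape:

```python
# Minimal illustration of the KIE task: OCR tokens + boxes in,
# structured key-value pairs out. All data below is made up.
words  = ["Invoice", "No.", "12345", "Date:", "12", "Oct", "2022"]
boxes  = [(30, 20, 90, 35), (95, 20, 120, 35), (125, 20, 175, 35),  # (x0, y0, x1, y1)
          (30, 50, 70, 65), (75, 50, 95, 65), (100, 50, 130, 65), (135, 50, 170, 65)]
labels = ["O", "O", "B-INVOICE_NO", "O", "B-DATE", "I-DATE", "I-DATE"]

# Group consecutive B-/I- tokens of the same field into key-value pairs.
# (The boxes are unused in this toy grouping, but they are the key extra
# input that layout-aware models consume.)
entities, current = {}, None
for word, label in zip(words, labels):
    if label.startswith("B-"):
        current = label[2:]
        entities[current] = word
    elif label.startswith("I-") and current == label[2:]:
        entities[current] += " " + word
    else:
        current = None

print(entities)  # {'INVOICE_NO': '12345', 'DATE': '12 Oct 2022'}
```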

Most implemented papers

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

microsoft/unilm 31 Dec 2019

In this paper, we propose LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents.
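
LayoutLM is integrated in the HuggingFace transformers library; the following minimal sketch (words, boxes, and label count are placeholders, and the base checkpoint carries no fine-tuned KIE head, so the logits are untrained) shows how words with 0-1000-normalized bounding boxes are fed through a token-classification head:

```python
import torch
from transformers import LayoutLMTokenizer, LayoutLMForTokenClassification

tokenizer = LayoutLMTokenizer.from_pretrained("microsoft/layoutlm-base-uncased")
model = LayoutLMForTokenClassification.from_pretrained(
    "microsoft/layoutlm-base-uncased", num_labels=5)  # hypothetical label set

words = ["Invoice", "No.", "12345"]
word_boxes = [[68, 12, 180, 40], [190, 12, 230, 40], [240, 12, 340, 40]]  # 0-1000 grid

# Expand each word's box to its sub-word tokens.
tokens, boxes = [], []
for word, box in zip(words, word_boxes):
    word_tokens = tokenizer.tokenize(word)
    tokens.extend(word_tokens)
    boxes.extend([box] * len(word_tokens))

# Add special tokens with their conventional boxes.
input_ids = tokenizer.convert_tokens_to_ids(
    [tokenizer.cls_token] + tokens + [tokenizer.sep_token])
bbox = [[0, 0, 0, 0]] + boxes + [[1000, 1000, 1000, 1000]]

outputs = model(input_ids=torch.tensor([input_ids]), bbox=torch.tensor([bbox]))
print(outputs.logits.shape)  # (1, sequence_length, num_labels)
```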

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

microsoft/unilm ACL 2021

Pre-training of text and layout has proven effective in a variety of visually-rich document understanding tasks, thanks to its model architecture and the availability of large-scale unlabeled scanned/digital-born documents.

PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks

wenwenyu/PICK-pytorch 16 Apr 2020

Computer vision with state-of-the-art deep learning models has recently achieved huge success in the field of Optical Character Recognition (OCR), including text detection and recognition tasks.

Spatial Dual-Modality Graph Reasoning for Key Information Extraction

open-mmlab/mmocr 26 Mar 2021

To thoroughly evaluate our proposed method and to support future research, we release a new dataset named WildReceipt, collected and annotated specifically for evaluating key information extraction from document images with unseen templates in the wild.

Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations

lamps-lab/AutoMeta 1 Jul 2021

Our experiments show that CRF with visual features outperformed both a heuristic and a CRF model with only text-based features.
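
Setting the paper's exact feature set aside, the general recipe of a token-sequence CRF whose features mix text cues with visual cues can be sketched with sklearn-crfsuite; the feature names, labels, and toy data below are invented for illustration:

```python
import sklearn_crfsuite

def token_features(tok):
    # tok: (text, font_size, is_bold, y_position) -- hypothetical visual attributes
    text, font_size, is_bold, y_pos = tok
    return {
        "lower": text.lower(),       # text-based features
        "is_title": text.istitle(),
        "is_digit": text.isdigit(),
        "font_size": font_size,      # visual features
        "is_bold": is_bold,
        "top_of_page": y_pos < 150,
    }

# One tiny "document": tokens from an invented thesis title page.
doc = [("A", 18, True, 80), ("Study", 18, True, 80),
       ("by", 10, False, 300), ("Jane", 10, False, 320), ("Doe", 10, False, 320)]
X = [[token_features(t) for t in doc]]
y = [["B-TITLE", "I-TITLE", "O", "B-AUTHOR", "I-AUTHOR"]]

crf = sklearn_crfsuite.CRF(algorithm="lbfgs", max_iterations=50)
crf.fit(X, y)
print(crf.predict(X))
```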

MMOCR: A Comprehensive Toolbox for Text Detection, Recognition and Understanding

open-mmlab/mmocr 14 Aug 2021

We present MMOCR, an open-source toolbox that provides a comprehensive pipeline for text detection and recognition, as well as downstream tasks such as named entity recognition and key information extraction.
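
In recent MMOCR releases, the whole pipeline, including SDMG-R-based key information extraction, can be driven from a single inferencer. A sketch following the 1.x MMOCRInferencer interface (the image path is a placeholder, and the exact result keys vary across versions):

```python
from mmocr.apis import MMOCRInferencer

# Chain text detection, recognition, and SDMG-R key information extraction.
infer = MMOCRInferencer(det="DBNet", rec="SAR", kie="SDMGR")
result = infer("receipt.jpg")  # placeholder path to a receipt image

# Per-image predictions carry detected polygons, recognized strings,
# and KIE field labels; inspect the keys on your installed version.
print(result["predictions"][0].keys())
```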

LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding

jpwang/lilt ACL 2022

LiLT can be pre-trained on structured documents in a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models.
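
LiLT is also available in HuggingFace transformers; a minimal sketch using the published English checkpoint (the label count and the dummy per-token box are placeholders):

```python
import torch
from transformers import AutoTokenizer, LiltForTokenClassification

tokenizer = AutoTokenizer.from_pretrained("SCUT-DLVCLab/lilt-roberta-en-base")
model = LiltForTokenClassification.from_pretrained(
    "SCUT-DLVCLab/lilt-roberta-en-base", num_labels=7)  # hypothetical label set

encoding = tokenizer("Total: $99.00", return_tensors="pt")
seq_len = encoding.input_ids.shape[1]
# One 0-1000-normalized box per token (a single dummy box repeated here).
bbox = torch.tensor([[[100, 500, 300, 540]] * seq_len])

logits = model(**encoding, bbox=bbox).logits
print(logits.shape)  # (1, seq_len, num_labels)
```

Because the layout flow is decoupled from the text model, the same layout weights can in principle be paired with a different monolingual RoBERTa-style model for cross-lingual transfer.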

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

microsoft/unilm 18 Apr 2022

In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.
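
LayoutLMv3 likewise ships with HuggingFace transformers. The sketch below supplies its own words and boxes (apply_ocr=False) so no OCR engine is needed; the blank stand-in image, boxes, and label count are placeholders, and the base checkpoint has no fine-tuned KIE head:

```python
import torch
from PIL import Image
from transformers import LayoutLMv3Processor, LayoutLMv3ForTokenClassification

processor = LayoutLMv3Processor.from_pretrained(
    "microsoft/layoutlmv3-base", apply_ocr=False)
model = LayoutLMv3ForTokenClassification.from_pretrained(
    "microsoft/layoutlmv3-base", num_labels=5)

image = Image.new("RGB", (1000, 1000), "white")  # stand-in page image
words = ["Invoice", "No.", "12345"]
boxes = [[68, 12, 180, 40], [190, 12, 230, 40], [240, 12, 340, 40]]  # 0-1000 grid

encoding = processor(image, words, boxes=boxes, return_tensors="pt")
with torch.no_grad():
    logits = model(**encoding).logits
print(logits.shape)  # (1, text-token length, num_labels)
```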

ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding

PaddlePaddle/PaddleNLP 12 Oct 2022

Recent years have witnessed the rise and success of pre-training techniques in visually-rich document understanding.

LAMBERT: Layout-Aware (Language) Modeling for information extraction

applicaai/lambert 19 Feb 2020

We introduce a simple new approach to the problem of understanding documents where non-trivial layout influences the local semantics.
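
The core idea of layout-aware language modeling, injecting each token's position on the page into the model's input embeddings, can be illustrated with a toy PyTorch module (dimensions and the linear bounding-box projection are illustrative, not LAMBERT's exact parameterization):

```python
import torch
import torch.nn as nn

class LayoutAwareEmbedding(nn.Module):
    """Word embedding plus a projection of the token's bounding box."""
    def __init__(self, vocab_size=30000, hidden=768):
        super().__init__()
        self.word_emb = nn.Embedding(vocab_size, hidden)
        self.layout_proj = nn.Linear(4, hidden)  # (x0, y0, x1, y1) -> hidden

    def forward(self, input_ids, bboxes):
        # bboxes: (batch, seq, 4), coordinates normalized to [0, 1]
        return self.word_emb(input_ids) + self.layout_proj(bboxes)

emb = LayoutAwareEmbedding()
ids = torch.randint(0, 30000, (1, 5))
boxes = torch.rand(1, 5, 4)
print(emb(ids, boxes).shape)  # torch.Size([1, 5, 768])
```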