Key Information Extraction
22 papers with code • 6 benchmarks • 7 datasets
Key Information Extraction (KIE) aims to extract structured information (e.g., key-value pairs) from form-style documents (e.g., invoices), an important step toward intelligent document understanding.
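KIE is commonly framed as token classification: an OCR/model pipeline assigns BIO tags to document tokens, which are then decoded into key-value pairs. The sketch below shows only that decoding step; the field names and example tokens are hypothetical illustration data, not from any of the datasets listed here.

```python
# Minimal sketch of the KIE output format: decoding BIO-tagged OCR tokens
# (e.g., from a LayoutLM-style token classifier) into key-value pairs.
# The tag set and tokens below are hypothetical illustration data.

def decode_entities(tokens, tags):
    """Group BIO-tagged tokens into (field, text) pairs."""
    entities, current_field, current_tokens = [], None, []
    for token, tag in zip(tokens, tags):
        if tag.startswith("B-"):               # a new field starts
            if current_field:
                entities.append((current_field, " ".join(current_tokens)))
            current_field, current_tokens = tag[2:], [token]
        elif tag.startswith("I-") and current_field == tag[2:]:
            current_tokens.append(token)       # same field continues
        else:                                  # "O" tag: flush any open field
            if current_field:
                entities.append((current_field, " ".join(current_tokens)))
            current_field, current_tokens = None, []
    if current_field:                          # flush the last open field
        entities.append((current_field, " ".join(current_tokens)))
    return entities

tokens = ["Invoice", "No:", "INV-0042", "Total:", "$", "137.50"]
tags   = ["O", "O", "B-INVOICE_ID", "O", "B-TOTAL", "I-TOTAL"]
print(decode_entities(tokens, tags))
# [('INVOICE_ID', 'INV-0042'), ('TOTAL', '$ 137.50')]
```

The tagging model itself varies across the papers below (CRFs, graph networks, layout-aware transformers), but most produce output in this form.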
Libraries
Use these libraries to find Key Information Extraction models and implementations.
Most implemented papers
LayoutLM: Pre-training of Text and Layout for Document Image Understanding
In this paper, we propose LayoutLM to jointly model interactions between text and layout information across scanned document images, which is beneficial for a great number of real-world document image understanding tasks such as information extraction from scanned documents.
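The core idea of jointly modeling text and layout can be sketched as follows: each token's input embedding is the sum of its text embedding and embeddings of its bounding-box coordinates on a normalized 0-1000 page grid. This is a toy illustration of that scheme under stated assumptions (tiny random tables, a hypothetical vocabulary), not the released LayoutLM implementation.

```python
# Toy sketch of LayoutLM-style input embeddings: text embedding plus
# 2-D position embeddings for the token's bounding box, all summed.
import random

random.seed(0)
DIM, GRID = 4, 1001                     # tiny embedding size, 0-1000 coordinate grid

def make_table(rows):
    """A random embedding lookup table (stand-in for learned parameters)."""
    return [[random.uniform(-1, 1) for _ in range(DIM)] for _ in range(rows)]

text_emb = make_table(100)              # hypothetical vocabulary of 100 token ids
x_emb, y_emb = make_table(GRID), make_table(GRID)

def input_embedding(token_id, box):
    """box = (x0, y0, x1, y1) in the normalized 0-1000 coordinate space."""
    x0, y0, x1, y1 = box
    parts = [text_emb[token_id], x_emb[x0], y_emb[y0], x_emb[x1], y_emb[y1]]
    return [sum(p[d] for p in parts) for d in range(DIM)]

# token id 7 located near the top-left of the page
vec = input_embedding(7, (25, 40, 180, 60))
print(len(vec))  # 4
```

Because layout enters through simple additive embeddings, the rest of the transformer stack can stay unchanged, which is why this design was easy to pre-train at scale.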
LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding
Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents.
PICK: Processing Key Information Extraction from Documents using Improved Graph Learning-Convolutional Networks
Computer vision with state-of-the-art deep learning models has recently achieved huge success in the field of Optical Character Recognition (OCR), including text detection and recognition tasks.
Spatial Dual-Modality Graph Reasoning for Key Information Extraction
To thoroughly evaluate our proposed method and facilitate future research, we release a new dataset named WildReceipt, collected and annotated specifically for evaluating key information extraction from document images with unseen templates in the wild.
Automatic Metadata Extraction Incorporating Visual Features from Scanned Electronic Theses and Dissertations
Our experiments show that CRF with visual features outperformed both a heuristic and a CRF model with only text-based features.
LiLT: A Simple yet Effective Language-Independent Layout Transformer for Structured Document Understanding
LiLT can be pre-trained on the structured documents of a single language and then directly fine-tuned on other languages with the corresponding off-the-shelf monolingual/multilingual pre-trained textual models.
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.
ERNIE-Layout: Layout Knowledge Enhanced Pre-training for Visually-rich Document Understanding
Recent years have witnessed the rise and success of pre-training techniques in visually-rich document understanding.
LAMBERT: Layout-Aware (Language) Modeling for information extraction
We introduce a simple new approach to the problem of understanding documents where non-trivial layout influences the local semantics.
ICDAR2019 Competition on Scanned Receipt OCR and Information Extraction
In this competition, we set up three tasks, namely, Scanned Receipt Text Localisation (Task 1), Scanned Receipt OCR (Task 2) and Key Information Extraction from Scanned Receipts (Task 3).