Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
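As an illustration of the text-line definition quoted above, the following is a minimal sketch that groups component bounding boxes into horizontal text lines. It is not the docstrum method from the cited paper, only a simple greedy grouping. The function name `group_into_text_lines`, the `(x0, y0, x1, y1)` box format, and the `max_gap` and `min_overlap` thresholds are all assumptions made for this example.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1), with y growing downward.
Box = Tuple[float, float, float, float]


def group_into_text_lines(
    boxes: List[Box],
    max_gap: float = 10.0,      # assumed: largest horizontal gap within a line
    min_overlap: float = 0.5,   # assumed: required vertical overlap ratio
) -> List[List[Box]]:
    """Greedily group component boxes into horizontal text lines."""
    lines: List[List[Box]] = []
    # Visit boxes left to right so each line grows at its right end.
    for box in sorted(boxes, key=lambda b: b[0]):
        for line in lines:
            last = line[-1]
            # Vertical overlap, measured against the shorter of the two boxes.
            overlap = min(box[3], last[3]) - max(box[1], last[1])
            smaller = min(box[3] - box[1], last[3] - last[1])
            gap = box[0] - last[2]
            if smaller > 0 and overlap / smaller >= min_overlap and gap <= max_gap:
                line.append(box)
                break
        else:
            # No existing line is "relatively close": start a new one.
            lines.append([box])
    return lines


demo_boxes = [(0, 0, 10, 10), (12, 1, 22, 11), (0, 30, 10, 40),
              (13, 30, 20, 40), (100, 0, 110, 10)]
demo_lines = group_into_text_lines(demo_boxes)
```

With these thresholds, the five demo boxes fall into three groups: two two-box lines and one isolated box that is too far to the right to join the first line. Real pages would need thresholds scaled to the character size, and vertical text would need the axes swapped.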

Most implemented papers

DocBank: A Benchmark Dataset for Document Layout Analysis

doc-analysis/DocBank COLING 2020

DocBank is constructed using a simple yet effective way with weak supervision from the LaTeX documents available on arXiv.com.

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models CVPR 2022

In this paper, we bring scene text detection and layout analysis together and introduce the task of unified scene text detection and layout analysis.

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

microsoft/unilm 18 Apr 2022

In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.

Multi-Task Handwritten Document Layout Analysis

lquirosd/P2PaLA 22 Jun 2018

Document Layout Analysis is a fundamental step in Handwritten Text Processing systems, from the extraction of the text lines to the type of zone it belongs to.

docExtractor: An off-the-shelf historical document element extraction

monniert/docExtractor 15 Dec 2020

We present docExtractor, a generic approach for extracting visual elements such as text lines or illustrations from historical documents without requiring any real data annotation.

VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

hikopensource/davar-lab-ocr 13 May 2021

We propose a unified framework, VSR, for document layout analysis, combining vision, semantics and relations.

ICDAR 2021 Competition on Historical Map Segmentation

icdar21-mapseg/icdar21-mapseg-eval 27 May 2021

Task 2 consists of segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase detection edge accuracy.

DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis

biswassanket/synth_doc_generation 6 Jul 2021

The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.

LayoutReader: Pre-training of Text and Layout for Reading Order Detection

microsoft/unilm EMNLP 2021

Reading order detection is the cornerstone to understanding visually-rich documents (e.g., receipts and forms).

DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer

biswassanket/docsegtr 27 Jan 2022

Instance-level segmentation of document objects has emerged as an interesting problem for the document analysis and understanding community.