Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
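As a rough illustration of the text-line definition quoted above, the following minimal sketch groups connected-component bounding boxes into horizontal text lines. It is not the docstrum method from the cited paper. The `Box` type, the `group_into_lines` helper, and the `max_gap` and `min_overlap` thresholds are hypothetical names and values chosen for the example, and only horizontal orientation is handled.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one connected component, in pixels."""
    x0: float
    y0: float
    x1: float
    y1: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that both boxes share."""
    shared = min(a.y1, b.y1) - max(a.y0, b.y0)
    shorter = min(a.y1 - a.y0, b.y1 - b.y0)
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: List[Box], max_gap: float = 15.0,
                     min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily chain boxes, left to right, into horizontal text lines.

    A box joins a line when it is "relatively close" to the line's last
    box (horizontal gap <= max_gap) and roughly on the same baseline
    (vertical overlap >= min_overlap). Both thresholds are assumptions.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x0):
        for line in lines:
            last = line[-1]
            close = box.x0 - last.x1 <= max_gap
            aligned = vertical_overlap(last, box) >= min_overlap
            if close and aligned:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new one.
            lines.append([box])
    return lines


# Two short lines plus one distant component on the first row.
a, b = Box(0, 0, 10, 10), Box(12, 1, 22, 11)
c, d = Box(0, 30, 10, 40), Box(14, 31, 24, 41)
e = Box(100, 0, 110, 10)
lines = group_into_lines([a, b, c, d, e])
```

In practice the thresholds would be derived from the page itself, for example from typical character height and spacing, rather than fixed in advance.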

Image credit: PubLayNet: largest dataset ever for document layout analysis

Benchmarks

These leaderboards track progress in Document Layout Analysis.

Dataset	Best Model
PubLayNet val	VGT
RVL-CDIP	VisualWordGrid
Document Layout Recognition Challenge test	USYD NLP_CS29-2
Document Layout Recognition Challenge mini-dev	Faster_RCNN

Libraries

Use these libraries to find Document Layout Analysis models and implementations.

Library	Papers	GitHub stars
huggingface/transformers	6	124,889
microsoft/unilm	3	18,315
facebookresearch/data2vec_vision	3	(not listed)
PaddlePaddle/PaddleOCR	2	38,418

See all 8 libraries.

Subtasks

MS-SSIM

Most implemented papers

DocBank: A Benchmark Dataset for Document Layout Analysis

doc-analysis/DocBank • COLING 2020

DocBank is constructed using a simple yet effective way with weak supervision from the LaTeX documents available on arXiv.com.


Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models • CVPR 2022

In this paper, we bring scene text detection and layout analysis together and introduce the task of unified scene text detection and layout analysis.


LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

microsoft/unilm • 18 Apr 2022

In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.


Multi-Task Handwritten Document Layout Analysis

lquirosd/P2PaLA • 22 Jun 2018

Document Layout Analysis is a fundamental step in Handwritten Text Processing systems, from the extraction of the text lines to the type of zone it belongs to.


docExtractor: An off-the-shelf historical document element extraction

monniert/docExtractor • 15 Dec 2020

We present docExtractor, a generic approach for extracting visual elements such as text lines or illustrations from historical documents without requiring any real data annotation.


VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations

hikopensource/davar-lab-ocr • 13 May 2021

To address the limitations of existing approaches, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.


ICDAR 2021 Competition on Historical Map Segmentation

icdar21-mapseg/icdar21-mapseg-eval • 27 May 2021

Task 2 consists of segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase detection edge accuracy.
