Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
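The notion of a text line in the quotation above can be illustrated with a small sketch. This is not O'Gorman's docstrum method; it is a simplified greedy grouping that assumes horizontal text and axis-aligned component boxes given as (x0, y0, x1, y1) with y growing downward. The names `vertical_overlap` and `group_into_lines` and the `min_overlap` threshold are hypothetical choices made for illustration, and the horizontal "relatively close" condition is deliberately left out.

```python
from typing import List, Tuple

# Assumed box convention: (x0, y0, x1, y1), with y growing downward.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Return the shared vertical extent as a fraction of the shorter box's height."""
    shared = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(shared, 0.0) / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: List[Box], min_overlap: float = 0.5) -> List[List[Box]]:
    """Greedily group component boxes into horizontal text lines.

    A box joins the first existing line whose most recently added box
    overlaps it vertically by at least `min_overlap`; otherwise it
    starts a new line. Horizontal distance is ignored in this sketch.
    """
    lines: List[List[Box]] = []
    # Visit boxes from top to bottom, then left to right.
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):
        for line in lines:
            if vertical_overlap(line[-1], box) >= min_overlap:
                line.append(box)
                break
        else:
            lines.append([box])
    # Order the boxes of each line from left to right.
    return [sorted(line, key=lambda b: b[0]) for line in lines]


if __name__ == "__main__":
    components = [
        (0, 0, 10, 10), (12, 1, 20, 11), (24, 0, 30, 10),  # upper row
        (0, 20, 10, 30), (12, 21, 22, 31),                 # lower row
    ]
    for text_line in group_into_lines(components):
        print(text_line)
```

Comparing only against the most recently added box keeps the sketch short; a more careful version would also require a bounded horizontal gap, so that columns set side by side are not merged into one line.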

Image credit: PubLayNet: largest dataset ever for document layout analysis

Benchmarks

These leaderboards are used to track progress in Document Layout Analysis.

Dataset	Best Model
PubLayNet val	VGT
RVL-CDIP	VisualWordGrid
Document Layout Recognition Challenge test	USYD NLP_CS29-2
Document Layout Recognition Challenge mini-dev	Faster_RCNN

Libraries

Use these libraries to find Document Layout Analysis models and implementations.

huggingface/transformers • 6 papers • 124,984
microsoft/unilm • 3 papers • 18,327
facebookresearch/data2vec_vision • 3 papers
PaddlePaddle/PaddleOCR • 2 papers • 38,458

See all 8 libraries.

Subtasks

MS-SSIM

Latest papers with no code

RoDLA: Benchmarking the Robustness of Document Layout Analysis Models

no code yet • 21 Mar 2024

To address this, we are the first to introduce a robustness benchmark for DLA models, which includes 450K document images of three datasets.

AutoIE: An Automated Framework for Information Extraction from Scientific Literature

no code yet • 30 Jan 2024

In the rapidly evolving field of scientific research, efficiently extracting key information from the burgeoning volume of scientific papers remains a formidable challenge.

U-DIADS-Bib: a full and few-shot pixel-precise dataset for document layout analysis of ancient manuscripts

no code yet • 16 Jan 2024

Document Layout Analysis, which is the task of identifying different semantic regions inside of a document page, is a subject of great interest for both computer scientists and humanities scholars as it represents a fundamental step towards further analysis tasks for the former and a powerful tool to improve and facilitate the study of the documents for the latter.

Object Recognition from Scientific Document based on Compartment Refinement Framework

no code yet • 14 Dec 2023

The lack of a comprehensive definition of the internal structure and elements of the documents indirectly impacts the accuracy of text classification and object recognition tasks.
