Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
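The definition of a text line quoted above (adjacent components that are "relatively close" and lie along a roughly straight horizontal line) can be illustrated with a small sketch. This is a simplified greedy heuristic for horizontal lines only, not the docstrum method of the cited paper. The names `Box` and `group_into_text_lines`, and the thresholds `max_gap` and `min_overlap`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of one connected component, in pixels."""
    x0: float
    y0: float
    x1: float
    y1: float

    @property
    def height(self) -> float:
        return self.y1 - self.y0


def group_into_text_lines(boxes, max_gap=15.0, min_overlap=0.5):
    """Greedily group boxes into horizontal text lines.

    A box joins an existing line when it is "relatively close" to the
    line's last box (horizontal gap <= max_gap) and the two boxes share
    enough vertical extent (overlap >= min_overlap of the shorter box),
    which approximates "a straight horizontal line can be drawn through them".
    Both thresholds are illustrative assumptions, not values from the paper.
    """
    lines = []
    # Visit boxes left to right so each line grows in reading order.
    for box in sorted(boxes, key=lambda b: b.x0):
        for line in lines:
            last = line[-1]
            gap = box.x0 - last.x1
            overlap = min(box.y1, last.y1) - max(box.y0, last.y0)
            shorter = min(box.height, last.height)
            if gap <= max_gap and shorter > 0 and overlap / shorter >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new one.
            lines.append([box])
    return lines


if __name__ == "__main__":
    components = [
        Box(0, 0, 8, 10), Box(10, 0, 18, 10), Box(20, 0, 28, 10),  # first row
        Box(0, 30, 8, 40), Box(10, 30, 18, 40),                    # second row
        Box(100, 0, 108, 10),                                      # far to the right
    ]
    for line in group_into_text_lines(components):
        print([(b.x0, b.y0) for b in line])
```

In this sketch the far-right component starts its own line because its horizontal gap exceeds `max_gap`, even though it sits at the same height as the first row. Real layout analysis systems usually replace such fixed thresholds with statistics estimated from the page or with a learned detector.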

Image credit: PubLayNet: largest dataset ever for document layout analysis

Benchmarks

These leaderboards are used to track progress in Document Layout Analysis

Dataset	Best Model
PubLayNet val	VGT
RVL-CDIP	VisualWordGrid
Document Layout Recognition Challenge test	USYD NLP_CS29-2
Document Layout Recognition Challenge mini-dev	Faster_RCNN

Libraries

Use these libraries to find Document Layout Analysis models and implementations

huggingface/transformers • 6 papers • 125,425
microsoft/unilm • 3 papers • 18,378
facebookresearch/data2vec_vision • 3 papers
PaddlePaddle/PaddleOCR • 2 papers • 38,665

See all 8 libraries.

Datasets

Subtasks

MS-SSIM

Latest papers with no code

Performance Enhancement Leveraging Mask-RCNN on Bengali Document Layout Analysis

no code yet • 21 Aug 2023

We trained a special model called Mask R-CNN to help with this understanding.

Framework and Model Analysis on Bengali Document Layout Analysis Dataset: BaDLAD

no code yet • 15 Aug 2023

We looked at lots of different Bengali documents in our study.

Bridging the Performance Gap between DETR and R-CNN for Graphical Object Detection in Document Images

no code yet • 23 Jun 2023

Upon integrating query modifications in DETR, we outperform prior works and achieve new state-of-the-art results with mAP of 96.9%, 95.7%, and 99.3% on TableBank, PubLayNet, and PubTables, respectively.

Document Layout Annotation: Database and Benchmark in the Domain of Public Affairs

no code yet • 12 Jun 2023

Every day, thousands of digital documents are generated with useful information for companies, public organizations, and citizens.

M$^{6}$Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis

no code yet • 15 May 2023

Document layout analysis is a crucial prerequisite for document understanding, including document retrieval and conversion.

Extracting Complex Named Entities in Legal Documents via Weakly Supervised Object Detection

no code yet • 10 May 2023

Accurate Named Entity Recognition (NER) is crucial for various information retrieval tasks in industry.

Détection d'Objets dans les documents numérisés par réseaux de neurones profonds

no code yet • 27 Jan 2023

For this purpose, we propose confidence estimators from different approaches for object detection.

Efficient few-shot learning for pixel-precise handwritten document layout analysis

no code yet • 27 Oct 2022

Layout analysis is a task of utmost importance in ancient handwritten document analysis and represents a fundamental step toward the simplification of subsequent tasks such as optical character recognition and automatic transcription.

Transformer-based Approach for Document Understanding

no code yet • IEEE International Conference on Image Processing 2022

We present an end-to-end transformer-based framework named TRDLU for the task of Document Layout Understanding (DLU).

Unified Pretraining Framework for Document Understanding

no code yet • 22 Apr 2022

Document intelligence automates the extraction of information from documents and supports many business applications.
