Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
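The text-line notion in the quotation above can be illustrated with a minimal sketch. This is not O'Gorman's docstrum method, only a simple greedy grouping under stated assumptions: each character or word arrives as an axis-aligned box `(x0, y0, x1, y1)`, lines are horizontal, and "relatively close" is approximated by a hypothetical `max_gap` threshold in pixels. The names `Box`, `group_into_lines`, and `max_gap` are illustrative and do not come from the cited paper.

```python
from typing import List, Tuple

# Assumed box format: (x0, y0, x1, y1) in pixels.
Box = Tuple[float, float, float, float]


def group_into_lines(boxes: List[Box], max_gap: float = 10.0) -> List[List[Box]]:
    """Greedily group boxes into horizontal text lines.

    A box joins an existing line when it starts no more than `max_gap`
    pixels to the right of that line's last box and its vertical midpoint
    falls inside the last box's vertical extent, so a horizontal straight
    line can be drawn through both.
    """
    lines: List[List[Box]] = []
    # Process boxes left to right so each line grows in reading order.
    for box in sorted(boxes, key=lambda b: (b[0], b[1])):
        x0, y0, x1, y1 = box
        y_mid = (y0 + y1) / 2.0
        for line in lines:
            _, last_y0, last_x1, last_y1 = line[-1]
            close = (x0 - last_x1) <= max_gap
            aligned = last_y0 <= y_mid <= last_y1
            if close and aligned:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new line.
            lines.append([box])
    return lines
```

Real pages with skew, multiple columns, or vertical text need more than this; the sketch only shows how adjacency and collinearity together define a line.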

Image credit: PubLayNet: largest dataset ever for document layout analysis

Benchmarks

These leaderboards are used to track progress in Document Layout Analysis.

Dataset	Best Model
PubLayNet val	VGT
RVL-CDIP	VisualWordGrid
Document Layout Recognition Challenge test	USYD NLP_CS29-2
Document Layout Recognition Challenge mini-dev	Faster_RCNN

Libraries

Use these libraries to find Document Layout Analysis models and implementations.

huggingface/transformers

6 papers

124,984

microsoft/unilm

3 papers

18,327

facebookresearch/data2vec_vision

3 papers

PaddlePaddle/PaddleOCR

2 papers

38,458

See all 8 libraries.

Subtasks

MS-SSIM

Latest papers

Text Role Classification in Scientific Charts Using Multimodal Transformers

hjkimk/text-role-classification • 8 Feb 2024

The models are evaluated on various chart datasets, and results show that LayoutLMv3 outperforms UDOP in all experiments.

Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis

microsoft/comphrdoc • 22 Jan 2024

Our end-to-end system achieves state-of-the-art performance on two large-scale document layout analysis datasets (PubLayNet and DocLayNet), a high-quality hierarchical document structure reconstruction dataset (HRDoc), and our Comp-HRDoc benchmark.

DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding

anranwu-richpo/dcqa • 29 Oct 2023

Our DCQA dataset is expected to foster research on understanding visualizations in documents, especially for scenarios that require complex reasoning for charts in the visually-rich document.

DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond

alibabaresearch/advancedliteratemachinery • 19 Oct 2023

In this report, we introduce DocXChain, a powerful open-source toolchain for document parsing, which is designed and developed to automatically convert the rich information embodied in unstructured documents, such as text, tables and charts, into structured representations that are readable and manipulable by machines.

930

appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit

hitachi-nlp/appjsonify • 2 Oct 2023

We present appjsonify, a Python-based PDF-to-JSON conversion toolkit for academic papers.

Vision Grid Transformer for Document Layout Analysis

alibabaresearch/advancedliteratemachinery • ICCV 2023

Document pre-trained models and grid-based models have proven to be very effective on various tasks in Document AI.

930

29 Aug 2023

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

samakos/document-ai- • 29 Aug 2023

In this study, we aim to fill these gaps by conducting a comparative evaluation of state-of-the-art models in document layout analysis and investigating the potential of cross-lingual layout analysis by utilizing machine translation techniques.

A Graphical Approach to Document Layout Analysis

ivanstepanovftw/glam • 3 Aug 2023

Document layout analysis (DLA) is the task of detecting the distinct, semantic content within a document and correctly classifying these items into an appropriate category (e.g., text, title, figure).

SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation

maitysubhajit/selfdocseg • 1 May 2023

Document layout analysis is a known problem to the documents research community and has been vastly explored yielding a multitude of solutions ranging from text mining, and recognition to graph-based representation, visual feature extraction, etc.

PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis

NormXU/Layout2Graph • 24 Apr 2023

Document layout analysis has a wide range of requirements across various domains, languages, and business scenarios.
