Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
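The text-line definition quoted above (adjacent, "relatively close" components that a roughly horizontal line can pass through) can be illustrated with a minimal sketch. This is not O'Gorman's docstrum algorithm, only a simple greedy heuristic; the box format, the function names, and the `max_gap` and `min_overlap` thresholds are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) in pixels, with y growing downward.
Box = Tuple[float, float, float, float]


def vertical_overlap(a: Box, b: Box) -> float:
    """Return the shared vertical extent as a fraction of the shorter box's height."""
    shared = min(a[3], b[3]) - max(a[1], b[1])
    shorter = min(a[3] - a[1], b[3] - b[1])
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def group_into_lines(
    boxes: List[Box],
    max_gap: float = 15.0,      # assumed threshold for "relatively close"
    min_overlap: float = 0.5,   # assumed threshold for "on the same horizontal line"
) -> List[List[Box]]:
    """Greedily attach each box to the first line whose last box is close and level."""
    lines: List[List[Box]] = []
    # Process boxes left to right so each line grows in reading order.
    for box in sorted(boxes, key=lambda b: (b[0], b[1])):
        for line in lines:
            last = line[-1]
            gap = box[0] - last[2]  # horizontal distance to the line's last box
            if gap <= max_gap and vertical_overlap(last, box) >= min_overlap:
                line.append(box)
                break
        else:
            # No existing line accepted the box, so it starts a new line.
            lines.append([box])
    return lines
```

Calling `group_into_lines` on a list of character or word boxes returns one list of boxes per detected line. Vertical text would need the roles of the x and y coordinates swapped.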



Text Role Classification in Scientific Charts Using Multimodal Transformers

hjkimk/text-role-classification 8 Feb 2024

The models are evaluated on various chart datasets, and results show that LayoutLMv3 outperforms UDOP in all experiments.

Stars: 1

Detect-Order-Construct: A Tree Construction based Approach for Hierarchical Document Structure Analysis

microsoft/comphrdoc 22 Jan 2024

Our end-to-end system achieves state-of-the-art performance on two large-scale document layout analysis datasets (PubLayNet and DocLayNet), a high-quality hierarchical document structure reconstruction dataset (HRDoc), and our Comp-HRDoc benchmark.

Stars: 6

DCQA: Document-Level Chart Question Answering towards Complex Reasoning and Common-Sense Understanding

anranwu-richpo/dcqa 29 Oct 2023

Our DCQA dataset is expected to foster research on understanding visualizations in documents, especially for scenarios that require complex reasoning for charts in the visually-rich document.

Stars: 0

DocXChain: A Powerful Open-Source Toolchain for Document Parsing and Beyond

alibabaresearch/advancedliteratemachinery 19 Oct 2023

In this report, we introduce DocXChain, a powerful open-source toolchain for document parsing, which is designed and developed to automatically convert the rich information embodied in unstructured documents, such as text, tables and charts, into structured representations that are readable and manipulable by machines.

Stars: 930

appjsonify: An Academic Paper PDF-to-JSON Conversion Toolkit

hitachi-nlp/appjsonify 2 Oct 2023

We present appjsonify, a Python-based PDF-to-JSON conversion toolkit for academic papers.

Stars: 38

Vision Grid Transformer for Document Layout Analysis

alibabaresearch/advancedliteratemachinery ICCV 2023

Document pre-trained models and grid-based models have proven to be very effective on various tasks in Document AI.

Stars: 930
29 Aug 2023

Document AI: A Comparative Study of Transformer-Based, Graph-Based Models, and Convolutional Neural Networks For Document Layout Analysis

samakos/document-ai- 29 Aug 2023

In this study, we aim to fill these gaps by conducting a comparative evaluation of state-of-the-art models in document layout analysis and investigating the potential of cross-lingual layout analysis by utilizing machine translation techniques.

Stars: 11

A Graphical Approach to Document Layout Analysis

ivanstepanovftw/glam 3 Aug 2023

Document layout analysis (DLA) is the task of detecting the distinct, semantic content within a document and correctly classifying these items into an appropriate category (e.g., text, title, figure).

Stars: 8

SelfDocSeg: A Self-Supervised vision-based Approach towards Document Segmentation

maitysubhajit/selfdocseg 1 May 2023

Document layout analysis is a known problem to the documents research community and has been vastly explored yielding a multitude of solutions ranging from text mining, and recognition to graph-based representation, visual feature extraction, etc.

Stars: 29

PARAGRAPH2GRAPH: A GNN-based framework for layout paragraph analysis

NormXU/Layout2Graph 24 Apr 2023

Document layout analysis has a wide range of requirements across various domains, languages, and business scenarios.

Stars: 67