Document Layout Analysis

36 papers with code • 4 benchmarks • 9 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.

Image credit: PubLayNet: largest dataset ever for document layout analysis
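
The text-line definition quoted above (characters that are "relatively close" and lie along a roughly straight line) can be illustrated with a small sketch. This is not the docstrum method from the cited paper; it is a simplified greedy grouping for horizontal lines only. The function name `group_into_lines`, the `(x0, y0, x1, y1)` box format, and the `max_gap` threshold are assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box format: (x0, y0, x1, y1) with y growing downward.
Box = Tuple[float, float, float, float]


def group_into_lines(boxes: List[Box], max_gap: float = 10.0) -> List[List[Box]]:
    """Greedily group character/word boxes into horizontal text lines.

    A box joins an existing line when its vertical center falls inside the
    vertical extent of that line's last box ("on the same straight line")
    and the horizontal gap to that box is at most max_gap ("relatively close").
    """
    lines: List[List[Box]] = []
    # Visit boxes left to right so each line grows at its right end.
    for box in sorted(boxes, key=lambda b: b[0]):
        center_y = (box[1] + box[3]) / 2.0
        for line in lines:
            last = line[-1]
            same_row = last[1] <= center_y <= last[3]
            close_enough = box[0] - last[2] <= max_gap
            if same_row and close_enough:
                line.append(box)
                break
        else:
            # No compatible line was found, so start a new one.
            lines.append([box])
    # Order lines top to bottom, then left to right.
    lines.sort(key=lambda ln: (min(b[1] for b in ln), min(b[0] for b in ln)))
    return lines


if __name__ == "__main__":
    sample = [
        (0, 0, 10, 10), (12, 1, 22, 11), (24, 0, 34, 10),  # first row
        (0, 30, 10, 40), (13, 31, 23, 41),                 # second row
        (100, 0, 110, 10),                                 # far away, own line
    ]
    for line in group_into_lines(sample, max_gap=5.0):
        print(line)
```

A real layout analysis system would also need to handle vertical or skewed text and to estimate the gap threshold from the page itself rather than fixing it in advance.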

BaDLAD: A Large Multi-Domain Bengali Document Layout Analysis Dataset

anon-user-for-web/badlad 9 Mar 2023

While strides have been made in deep learning based Bengali Optical Character Recognition (OCR) in the past decade, the absence of large Document Layout Analysis (DLA) datasets has hindered the application of OCR in document transcription, e.g., transcribing historical documents and newspapers.

2
09 Mar 2023

CTE: A Dataset for Contextualized Table Extraction

ailab-unifi/cte-dataset 2 Feb 2023

We define the task of Contextualized Table Extraction (CTE), which aims to extract and define the structure of tables considering the textual context of the document.

15
02 Feb 2023

M6Doc: A Large-Scale Multi-Format, Multi-Type, Multi-Layout, Multi-Language, Multi-Annotation Category Dataset for Modern Document Layout Analysis

hciilab/m6doc CVPR 2023

Document layout analysis is a crucial prerequisite for document understanding, including document retrieval and conversion.

75
01 Jan 2023

Doc2Graph: a Task Agnostic Document Understanding Framework based on Graph Neural Networks

andreagemelli/doc2graph 23 Aug 2022

Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis.

106
23 Aug 2022

Doc-GCN: Heterogeneous Graph Convolutional Networks for Document Layout Analysis

adlnlp/doc_gcn COLING 2022

Recognizing the layout of unstructured digital documents is crucial when parsing the documents into the structured, machine-readable format for downstream applications.

14
22 Aug 2022

DocLayNet: A Large Human-Annotated Dataset for Document-Layout Analysis

DS4SD/DocLayNet 2 Jun 2022

Lastly, we compare models trained on PubLayNet, DocBank and DocLayNet, showing that layout predictions of the DocLayNet-trained models are more robust and thus the preferred choice for general-purpose document-layout analysis.

174
02 Jun 2022

LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking

huggingface/transformers 18 Apr 2022

In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.

125,425
18 Apr 2022

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models CVPR 2022

In this paper, we bring scene text detection and document layout analysis together and introduce the task of unified scene text detection and layout analysis.

76,621
28 Mar 2022

DiT: Self-supervised Pre-training for Document Image Transformer

huggingface/transformers 4 Mar 2022

We leverage DiT as the backbone network in a variety of vision-based Document AI tasks, including document image classification, document layout analysis, table detection as well as text detection for OCR.

125,425
04 Mar 2022

DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer

biswassanket/docsegtr 27 Jan 2022

Instance-level segmentation of document objects has emerged as an interesting problem for the document analysis and understanding community.

46
27 Jan 2022