Document Layout Analysis

16 papers with code • 4 benchmarks • 7 datasets

"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.

Image credit: PubLayNet: largest dataset ever for document layout analysis
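The definition quoted above rests on two ideas: connected components of adjacent pixels, and text lines formed from components that sit relatively close together along a roughly horizontal line. The following is a minimal sketch of both, not O'Gorman's docstrum method. The function names, the toy `page` grid, the 8-connectivity choice, and the `max_gap` threshold are all assumptions made for illustration.

```python
from collections import deque


def connected_components(grid):
    """Return bounding boxes (top, left, bottom, right) of 8-connected foreground regions."""
    rows, cols = len(grid), len(grid[0])
    seen = [[False] * cols for _ in range(rows)]
    boxes = []
    for r in range(rows):
        for c in range(cols):
            if grid[r][c] != 1 or seen[r][c]:
                continue
            # Breadth-first flood fill from this unvisited foreground pixel.
            seen[r][c] = True
            queue = deque([(r, c)])
            top, left, bottom, right = r, c, r, c
            while queue:
                y, x = queue.popleft()
                top, bottom = min(top, y), max(bottom, y)
                left, right = min(left, x), max(right, x)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and grid[ny][nx] == 1 and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes


def group_into_lines(boxes, max_gap=2):
    """Greedily chain boxes that overlap vertically and lie within max_gap pixels horizontally."""
    lines = []
    for box in sorted(boxes, key=lambda b: (b[1], b[0])):  # left to right
        for line in lines:
            last = line[-1]
            overlaps_vertically = box[0] <= last[2] and last[0] <= box[2]
            horizontal_gap = box[1] - last[3] - 1
            if overlaps_vertically and horizontal_gap <= max_gap:
                line.append(box)
                break
        else:
            lines.append([box])
    return lines


# Toy binary page (hypothetical): 1 = ink, 0 = background.
page = [
    [1, 1, 0, 1, 0, 0, 0, 0],
    [1, 1, 0, 1, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0, 0, 0],
    [1, 0, 1, 1, 0, 0, 0, 1],
]
boxes = connected_components(page)
lines = group_into_lines(boxes)
for line in lines:
    print(line)
```

On this toy page the isolated component at the far right of the bottom row forms its own group because its gap exceeds `max_gap`. That is the "relatively close" criterion in its crudest form; real systems estimate such thresholds from the page itself.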

Most implemented papers

Training data-efficient image transformers & distillation through attention

facebookresearch/deit 23 Dec 2020

In this work, we produce a competitive convolution-free transformer by training on Imagenet only.

LayoutLM: Pre-training of Text and Layout for Document Image Understanding

microsoft/unilm 31 Dec 2019

In this paper, we propose LayoutLM to jointly model interactions between text and layout information across scanned document images, which benefits many real-world document image understanding tasks, such as information extraction from scanned documents.
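To make "jointly model text and layout" concrete, here is a rough sketch of the kind of input such a model consumes: each word paired with its bounding box. Rescaling boxes to a 0-1000 grid is a convention commonly used with LayoutLM-style models, but it is not stated in the snippet above and is an assumption here, as are the helper name `normalize_box`, the sample words, and the page size.

```python
def normalize_box(box, page_width, page_height, scale=1000):
    """Rescale a pixel box (x0, y0, x1, y1) to an integer grid of size `scale`."""
    x0, y0, x1, y1 = box
    return (
        scale * x0 // page_width,
        scale * y0 // page_height,
        scale * x1 // page_width,
        scale * y1 // page_height,
    )


# Hypothetical OCR output: (word, pixel bounding box) on an 800 x 1000 page.
page_width, page_height = 800, 1000
words = [
    ("Invoice", (50, 40, 210, 80)),
    ("Total", (400, 700, 480, 730)),
]

# Each word keeps its text and gains a resolution-independent position.
pairs = [(word, normalize_box(box, page_width, page_height)) for word, box in words]
for word, box in pairs:
    print(word, box)
```

Normalizing positions this way lets pages scanned at different resolutions share one coordinate vocabulary, so a model can learn, for example, that totals tend to appear low on the page.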

BEiT: BERT Pre-Training of Image Transformers

microsoft/unilm ICLR 2022

We first "tokenize" the original image into visual tokens.

PubLayNet: largest dataset ever for document layout analysis

ibm-aur-nlp/PubLayNet 16 Aug 2019

Deep neural networks developed for computer vision have proven effective for analyzing the layout of document images.

LayoutLMv2: Multi-modal Pre-training for Visually-Rich Document Understanding

microsoft/unilm ACL 2021

Pre-training of text and layout has proved effective in a variety of visually-rich document understanding tasks due to its effective model architecture and the advantage of large-scale unlabeled scanned/digital-born documents.

Combining Visual and Textual Features for Semantic Segmentation of Historical Newspapers

dhlab-epfl/dhSegment-text 14 Feb 2020

The massive amounts of digitized historical documents acquired over the last decades naturally lend themselves to automatic processing and exploration.

A Large Dataset of Historical Japanese Documents with Complex Layouts

dell-research-harvard/HJDataset 18 Apr 2020

Deep learning-based approaches for automatic document layout analysis and content extraction have the potential to unlock rich information trapped in historical documents on a large scale.

DiT: Self-supervised Pre-training for Document Image Transformer

microsoft/unilm 4 Mar 2022

Image Transformer has recently achieved significant progress for natural image understanding, either using supervised (ViT, DeiT, etc.)

Towards End-to-End Unified Scene Text Detection and Layout Analysis

tensorflow/models CVPR 2022

In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.

Multi-Task Handwritten Document Layout Analysis

lquirosd/P2PaLA 22 Jun 2018

Document Layout Analysis is a fundamental step in Handwritten Text Processing systems, from the extraction of the text lines to the type of zone it belongs to.