Document Layout Analysis
36 papers with code • 4 benchmarks • 9 datasets
"Document Layout Analysis is performed to determine physical structure of a document, that is, to determine document components. These document components can consist of single connected components-regions [...] of pixels that are adjacent to form single regions [...] , or group of text lines. A text line is a group of characters, symbols, and words that are adjacent, “relatively close” to each other and through which a straight line can be drawn (usually with horizontal or vertical orientation)." L. O'Gorman, "The document spectrum for page layout analysis," in IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 15, no. 11, pp. 1162-1173, Nov. 1993.
Image credit: PubLayNet: largest dataset ever for document layout analysis
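The notion of a text line in the definition above (adjacent, "relatively close" elements through which a roughly horizontal straight line can be drawn) can be illustrated with a small sketch. This is not the docstrum method from the cited paper; it is a simple greedy heuristic, and the names `Box`, `vertical_overlap`, `group_into_lines` and the thresholds `min_overlap` and `max_gap` are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Box:
    """Axis-aligned bounding box of a word or connected component (hypothetical type)."""
    x0: float
    y0: float
    x1: float
    y1: float


def vertical_overlap(a: Box, b: Box) -> float:
    """Fraction of the shorter box's height that the two boxes share vertically."""
    shared = min(a.y1, b.y1) - max(a.y0, b.y0)
    shorter = min(a.y1 - a.y0, b.y1 - b.y0)
    return max(0.0, shared) / shorter if shorter > 0 else 0.0


def group_into_lines(boxes: List[Box],
                     min_overlap: float = 0.5,
                     max_gap: float = 20.0) -> List[List[Box]]:
    """Greedily group boxes into horizontal text lines.

    A box joins an existing line when it overlaps the line's last box
    vertically (same row) and starts within `max_gap` of that box's right
    edge ("relatively close"). Both thresholds are illustrative assumptions.
    """
    lines: List[List[Box]] = []
    for box in sorted(boxes, key=lambda b: b.x0):  # scan left to right
        for line in lines:
            last = line[-1]
            if (vertical_overlap(last, box) >= min_overlap
                    and box.x0 - last.x1 <= max_gap):
                line.append(box)
                break
        else:
            lines.append([box])  # no compatible line found: start a new one
    lines.sort(key=lambda line: min(b.y0 for b in line))  # order top to bottom
    return lines


if __name__ == "__main__":
    words = [
        Box(0, 0, 10, 10), Box(12, 1, 22, 11),  # two adjacent words on one row
        Box(0, 30, 10, 40),                     # a word on a lower row
        Box(100, 0, 110, 10),                   # same row, but too far away
    ]
    for line in group_into_lines(words):
        print(line)
```

A real system would usually estimate the gap threshold from the page itself (for example, from typical character spacing) and would also handle vertical or skewed text, which this sketch ignores.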
Most implemented papers
DocBank: A Benchmark Dataset for Document Layout Analysis
DocBank is constructed using a simple yet effective way with weak supervision from the LaTeX documents available on arXiv.com.
Towards End-to-End Unified Scene Text Detection and Layout Analysis
In this paper, we bring them together and introduce the task of unified scene text detection and layout analysis.
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking
In this paper, we propose LayoutLMv3 to pre-train multimodal Transformers for Document AI with unified text and image masking.
Multi-Task Handwritten Document Layout Analysis
Document Layout Analysis is a fundamental step in Handwritten Text Processing systems, from the extraction of the text lines to the type of zone it belongs to.
docExtractor: An off-the-shelf historical document element extraction
We present docExtractor, a generic approach for extracting visual elements such as text lines or illustrations from historical documents without requiring any real data annotation.
VSR: A Unified Framework for Document Layout Analysis combining Vision, Semantics and Relations
To address the above limitations, we propose a unified framework VSR for document layout analysis, combining vision, semantics and relations.
ICDAR 2021 Competition on Historical Map Segmentation
Task 2 consists in segmenting map content from the larger map sheet, and was won by the UWB team using a U-Net-like FCN combined with a binarization method to increase detection edge accuracy.
DocSynth: A Layout Guided Approach for Controllable Document Image Synthesis
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
LayoutReader: Pre-training of Text and Layout for Reading Order Detection
Reading order detection is the cornerstone to understanding visually-rich documents (e.g., receipts and forms).
DocSegTr: An Instance-Level End-to-End Document Image Segmentation Transformer
[...] has emerged as an interesting problem for the document analysis and understanding community.