Despite recent advances in automatic text recognition, performance remains moderate on historical manuscripts.
Geometric Deep Learning has recently attracted significant interest in a wide range of machine learning fields, including document analysis.
Date estimation of historical document images is a challenging problem, with several contributions in the literature that lack the ability to generalize from one dataset to another.
In this paper, we propose a Text-Degradation Invariant Auto Encoder (Text-DIAE), a self-supervised model designed to tackle two tasks: text recognition (handwritten or scene-text) and document image enhancement.
has emerged as an interesting problem for the document analysis and understanding community.
Document images can be affected by many degradation scenarios, which cause recognition and processing difficulties.
Ranked #1 on Binarization on H-DIBCO 2018
The results highlight that our model can successfully generate realistic and diverse document images with multiple objects.
This paper presents a novel method for date estimation of historical photographs from archival sources.
In this paper, we explore and evaluate the use of ranking-based objective functions for simultaneously learning a word string encoder and a word image encoder.
Low-resource Handwritten Text Recognition (HTR) is a hard problem due to scarce annotated data and very limited linguistic information (dictionaries and language models).
The emergence of geometric deep learning as a novel framework to deal with graph-based representations has displaced traditional approaches in favor of completely new methodologies.
In this work we propose an end-to-end model that combines a one-stage object detection network with branches for the recognition of text and named entities, respectively, so that shared features can be learned simultaneously from the training error of each task.
Graph embedding, which maps graphs to a vectorial space, has been proposed as a way to tackle these difficulties, enabling the use of standard machine learning techniques.
In this work we introduce a cross-modal image retrieval system that allows both text and sketch as input modalities for the query.
When extracting information from handwritten documents, text transcription and named entity recognition are usually treated as separate, subsequent tasks.
This paper presents a novel application to detect counterfeit identity documents forged by a scan-printing operation.
Many algorithms formulate graph matching as an optimization of an objective function of pairwise quantification of nodes and edges of two graphs to be matched.
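
As a rough illustration of the formulation in the last entry, the sketch below scores a candidate node assignment with a node term plus an edge term and picks the best assignment by brute force. It is not taken from any of the listed papers: the function names `matching_score` and `best_matching`, the adjacency-matrix inputs, and the additive form of the objective are all assumptions made for the example, and exhaustive search is only feasible for very small graphs of equal size.

```python
from itertools import permutations


def matching_score(adj_a, adj_b, node_sim, assignment):
    """Score a mapping of node i in graph A to node assignment[i] in graph B.

    Hypothetical objective: node similarities plus the number of directed
    edge entries of A that land on edges of B under the mapping.
    """
    n = len(assignment)
    # Node term: similarity of each matched node pair.
    node_term = sum(node_sim[i][assignment[i]] for i in range(n))
    # Edge term: an edge (i, j) of A contributes when its image is an edge of B.
    edge_term = sum(
        adj_a[i][j] * adj_b[assignment[i]][assignment[j]]
        for i in range(n)
        for j in range(n)
        if i != j
    )
    return node_term + edge_term


def best_matching(adj_a, adj_b, node_sim):
    """Exhaustively search all one-to-one assignments (tiny graphs only)."""
    n = len(adj_a)
    return max(
        permutations(range(n)),
        key=lambda p: matching_score(adj_a, adj_b, node_sim, p),
    )
```

Practical methods replace the exhaustive search with relaxations or learned approximations, since the number of assignments grows factorially with the number of nodes.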