Search Results for author: Christopher Kermorvant

Found 20 papers, 1 paper with code

Callico: a Versatile Open-Source Document Image Annotation Platform

no code implementations • 2 May 2024 • Christopher Kermorvant, Eva Bardou, Manon Blanco, Bastien Abadie

This paper presents Callico, a web-based open source platform designed to simplify the annotation process in document recognition projects.

Document Layout Analysis HTR +3

Revisiting N-Gram Models: Their Impact in Modern Neural Networks for Handwritten Text Recognition

no code implementations • 30 Apr 2024 • Solène Tarride, Christopher Kermorvant

In recent advances in automatic text recognition (ATR), deep neural networks have demonstrated the ability to implicitly capture language statistics, potentially reducing the need for traditional language models.

Handwriting Recognition Handwritten Text Recognition +1

Improving Automatic Text Recognition with Language Models in the PyLaia Open-Source Library

no code implementations • 29 Apr 2024 • Solène Tarride, Yoann Schneider, Marie Generali-Lince, Mélodie Boillet, Bastien Abadie, Christopher Kermorvant

PyLaia is one of the most popular open-source software for Automatic Text Recognition (ATR), delivering strong performance in terms of speed and accuracy.

Language Modelling

The Socface Project: Large-Scale Collection, Processing, and Analysis of a Century of French Censuses

no code implementations • 29 Apr 2024 • Mélodie Boillet, Solène Tarride, Yoann Schneider, Bastien Abadie, Lionel Kesztenbaum, Christopher Kermorvant

For this project, we developed a complete processing workflow: large-scale data collection from French departmental archives, collaborative annotation of documents, training of handwritten table text and structure recognition models, and mass processing of millions of images.

Table Recognition

How to Choose Pretrained Handwriting Recognition Models for Single Writer Fine-Tuning

no code implementations • 4 May 2023 • Vittorio Pippi, Silvia Cascianelli, Christopher Kermorvant, Rita Cucchiara

Recent advancements in Deep Learning-based Handwritten Text Recognition (HTR) have led to models with remarkable performance on both modern and historical manuscripts in large benchmark datasets.

Handwriting Recognition Handwritten Text Recognition +2

Confidence Estimation for Object Detection in Document Images

no code implementations • 29 Aug 2022 • Mélodie Boillet, Christopher Kermorvant, Thierry Paquet

In the active learning framework, the three first estimators show a significant improvement in performance for the detection of document physical pages and text lines compared to a random selection of images.

Active Learning Descriptive +3

The LAM Dataset: A Novel Benchmark for Line-Level Handwritten Text Recognition

no code implementations • 16 Aug 2022 • Silvia Cascianelli, Vittorio Pippi, Martin Maarand, Marcella Cornia, Lorenzo Baraldi, Christopher Kermorvant, Rita Cucchiara

With the aim of fostering the research on this topic, in this paper we present the Ludovico Antonio Muratori (LAM) dataset, a large line-level HTR dataset of Italian ancient manuscripts edited by a single author over 60 years.

Handwritten Text Recognition HTR

Robust Text Line Detection in Historical Documents: Learning and Evaluation Methods

no code implementations • 23 Mar 2022 • Mélodie Boillet, Christopher Kermorvant, Thierry Paquet

We present a study conducted using three state-of-the-art systems Doc-UFCN, dhSegment and ARU-Net and show that it is possible to build generic models trained on a wide variety of historical document datasets that can correctly segment diverse unseen pages.

document understanding Line Detection +1

Including Keyword Position in Image-based Models for Act Segmentation of Historical Registers

no code implementations • 17 Sep 2021 • Mélodie Boillet, Martin Maarand, Thierry Paquet, Christopher Kermorvant

However, the segmentation of complex documents into semantic regions is sometimes impossible relying only on visual features and recent models embed both visual and textual information.

Position

HORAE: an annotated dataset of books of hours

no code implementations • 1 Dec 2020 • Mélodie Boillet, Marie-Laurence Bonhomme, Dominique Stutzmann, Christopher Kermorvant

We introduce in this paper a new dataset of annotated pages from books of hours, a type of handwritten prayer books owned and used by rich lay people in the late middle ages.

Line Detection

Hierarchical Text Segmentation for Medieval Manuscripts

1 code implementation • COLING 2020 • Amir Hazem, Beatrice Daille, Dominique Stutzmann, Christopher Kermorvant, Louis Chevalier

In this paper, we address the segmentation of books of hours, Latin devotional manuscripts of the late Middle Ages, that exhibit challenging issues: a complex hierarchical entangled structure, variable content, noisy transcriptions with no sentence markers, and strong correlations between sections for which topical information is no longer sufficient to draw segmentation boundaries.

Hierarchical Text Segmentation Segmentation +2

Full-Page Text Recognition: Learning Where to Start and When to Stop

no code implementations • 27 Apr 2017 • Bastien Moysset, Christopher Kermorvant, Christian Wolf

Text line detection and localization is a crucial step for full page document analysis, but still suffers from heterogeneity of real life documents.

Line Detection Position

Curriculum Learning for Handwritten Text Line Recognition

no code implementations • 5 Dec 2013 • Jérôme Louradour, Christopher Kermorvant

Recurrent Neural Networks (RNN) have recently achieved the best performance in off-line Handwriting Text Recognition.

Dropout improves Recurrent Neural Networks for Handwriting Recognition

no code implementations • 5 Nov 2013 • Vu Pham, Théodore Bluche, Christopher Kermorvant, Jérôme Louradour

Recurrent neural networks (RNNs) with Long Short-Term memory cells currently hold the best known results in unconstrained handwriting recognition.

Handwriting Recognition
