no code implementations • 30 Apr 2024 • Solène Tarride, Christopher Kermorvant
In recent advances in automatic text recognition (ATR), deep neural networks have demonstrated the ability to implicitly capture language statistics, potentially reducing the need for traditional language models.
no code implementations • 29 Apr 2024 • Solène Tarride, Yoann Schneider, Marie Generali-Lince, Mélodie Boillet, Bastien Abadie, Christopher Kermorvant
PyLaia is one of the most popular open-source software packages for Automatic Text Recognition (ATR), delivering strong performance in terms of speed and accuracy.
no code implementations • 29 Apr 2024 • Mélodie Boillet, Solène Tarride, Yoann Schneider, Bastien Abadie, Lionel Kesztenbaum, Christopher Kermorvant
For this project, we developed a complete processing workflow: large-scale data collection from French departmental archives, collaborative annotation of documents, training of handwritten table text and structure recognition models, and mass processing of millions of images.
no code implementations • 29 Apr 2024 • David Villanova-Aparisi, Solène Tarride, Carlos-D. Martínez-Hinarejos, Verónica Romero, Christopher Kermorvant, Moisés Pastor-Gadea
In this paper, we propose and publicly release a set of reading-order-independent metrics tailored to Information Extraction evaluation in handwritten documents.
no code implementations • International Workshop on Historical Document Imaging and Processing 2023 • Solène Tarride, Tristan Faine, Mélodie Boillet, Harold Mouchère, Christopher Kermorvant
However, selecting training samples based on the degree of agreement between annotators introduces a bias in the training data and does not improve the results.
Ranked #1 on Handwritten Text Recognition on Belfort
no code implementations • 27 Apr 2023 • Solène Tarride, Martin Maarand, Mélodie Boillet, James McGrath, Eugénie Capel, Hélène Vézina, Christopher Kermorvant
Verification of the birth and death acts from this sample shows that 74% of them are considered complete and valid.
no code implementations • 26 Apr 2023 • Solène Tarride, Mélodie Boillet, Christopher Kermorvant
We propose a Transformer-based approach for information extraction from digitized handwritten documents.
no code implementations • 26 Apr 2023 • Solène Tarride, Mélodie Boillet, Jean-François Moufflet, Christopher Kermorvant
We propose a new database for information extraction from historical handwritten documents.
Ranked #1 on Key Information Extraction on SIMARA