1 code implementation • 12 Jul 2024 • Thomas Constum, Pierrick Tranouez, Thierry Paquet
Despite this, these integrated approaches have not yet matched the performance of language models when applied to information extraction in plain text.
2 code implementations • 20 May 2024 • Antonio Ríos-Vila, Jorge Calvo-Zaragoza, David Rizo, Thierry Paquet
In this paper, we present the first truly end-to-end approach for page-level OMR.
no code implementations • 30 Apr 2024 • Thomas Constum, Lucas Preel, Théo Larcher, Pierrick Tranouez, Thierry Paquet, Sandra Brée
The EXO-POPP project aims to establish a comprehensive database comprising 300,000 marriage records from Paris and its suburbs, spanning the years 1880 to 1940, which are preserved in over 130,000 scans of double pages.
1 code implementation • 12 Feb 2024 • Antonio Ríos-Vila, Jorge Calvo-Zaragoza, Thierry Paquet
State-of-the-art end-to-end Optical Music Recognition (OMR) has, to date, primarily been carried out using monophonic transcription techniques to handle complex score layouts, such as polyphony, often by resorting to simplifications or specific adaptations.
1 code implementation • 25 Jan 2023 • Denis Coquenet, Clément Chatelain, Thierry Paquet
Recent advances in handwritten text recognition have made it possible to recognize whole documents in an end-to-end way: the Document Attention Network (DAN) recognizes the characters one after the other through an attention-based prediction process until reaching the end of the document.
1 code implementation • 8 Sep 2022 • Paul Peseux, Maxime Berar, Thierry Paquet, Victor Nicollet
Categorical data are present in key areas such as health or supply chain, and these data require specific treatment.
no code implementations • 29 Aug 2022 • Mélodie Boillet, Christopher Kermorvant, Thierry Paquet
In the active learning framework, the first three estimators show a significant improvement in performance for the detection of document physical pages and text lines compared to a random selection of images.
1 code implementation • 23 Mar 2022 • Denis Coquenet, Clément Chatelain, Thierry Paquet
For the first time, we propose an end-to-end segmentation-free architecture for the task of handwritten document recognition: the Document Attention Network.
Ranked #1 on Handwritten Text Recognition on READ 2016
no code implementations • 23 Mar 2022 • Mélodie Boillet, Christopher Kermorvant, Thierry Paquet
We present a study conducted using three state-of-the-art systems (Doc-UFCN, dhSegment and ARU-Net) and show that it is possible to build generic models trained on a wide variety of historical document datasets that can correctly segment diverse unseen pages.
no code implementations • 17 Sep 2021 • Mélodie Boillet, Martin Maarand, Thierry Paquet, Christopher Kermorvant
However, the segmentation of complex documents into semantic regions is sometimes impossible when relying only on visual features, and recent models embed both visual and textual information.
1 code implementation • 17 Feb 2021 • Denis Coquenet, Clément Chatelain, Thierry Paquet
Unconstrained handwriting recognition is an essential task in document analysis.
Ranked #4 on Handwritten Text Recognition on READ2016(line-level)
no code implementations • 28 Dec 2020 • Mélodie Boillet, Christopher Kermorvant, Thierry Paquet
In this paper, we introduce a fully convolutional network for the document layout analysis task.
no code implementations • 9 Dec 2020 • Denis Coquenet, Yann Soullard, Clément Chatelain, Thierry Paquet
This has a direct influence on the training time of such architectures and, in turn, on the time required to explore various architectures.
1 code implementation • 9 Dec 2020 • Denis Coquenet, Clément Chatelain, Thierry Paquet
Unconstrained handwritten text recognition is a major step in most document analysis tasks.
Ranked #5 on Handwritten Text Recognition on IAM(line-level)
1 code implementation • 7 Dec 2020 • Denis Coquenet, Clément Chatelain, Thierry Paquet
For each set of text line features, a decoder module recognizes the associated character sequence, leading to the recognition of a whole paragraph.
Ranked #2 on Handwritten Text Recognition on READ2016(line-level)
5 code implementations • 23 Jan 2019 • Yann Soullard, Cyprien Ruffino, Thierry Paquet
We report an extension of a Keras Model, called CTCModel, to perform the Connectionist Temporal Classification (CTC) in a transparent way.
no code implementations • 28 Aug 2018 • Wassim Swaileh, Yann Soullard, Thierry Paquet
This makes possible the design of an end-to-end unified multilingual recognition system where both a single optical model and a single language model are trained on all the languages.
no code implementations • 22 Aug 2018 • Wassim Swaileh, Thierry Paquet
In this paper, we introduce a new syllable-based approach to text modeling for handwriting recognition.
no code implementations • 24 Jul 2017 • Bruno Stuner, Clément Chatelain, Thierry Paquet
Offline handwritten text line recognition is a hard task that requires both an efficient optical character recognizer and an efficient language model.
no code implementations • 22 Dec 2016 • Bruno Stuner, Clément Chatelain, Thierry Paquet
State-of-the-art methods for handwriting recognition are based on Long Short-Term Memory (LSTM) recurrent neural networks (RNNs), which now provide very impressive character recognition performance.
no code implementations • 3 Oct 2012 • Thomas Palfray, David Hébert, Stéphane Nicolas, Pierrick Tranouez, Thierry Paquet
Our back-end system extracts the logical structure of the page to produce the informative units: the articles.