no code implementations • 19 Jan 2022 • Christian Reul, Stefan Tomasek, Florian Langhanki, Uwe Springmann
We report on our efforts to construct mixed recognition models which can be applied out of the box without any further document-specific training, but which also serve as a starting point for fine-tuning, i.e. training a new model on a few pages of transcribed text (ground truth).
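As a sketch of that fine-tuning step, the loop below adapts a pretrained line recognizer with a CTC head to a handful of transcribed pages. It assumes a PyTorch model and data loader, and the checkpoint name is hypothetical; this is not the authors' actual tooling:

```python
import torch
import torch.nn as nn

# Hypothetical checkpoint of a mixed model trained on many typefaces.
model = torch.load("mixed_model.pt")
ctc_loss = nn.CTCLoss(blank=0, zero_infinity=True)
# Small learning rate: we adapt the mixed model, we do not retrain it.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)

def finetune(loader, epochs=10):
    """Adapt the mixed model on a few pages of transcribed lines."""
    model.train()
    for _ in range(epochs):
        for images, targets, input_lens, target_lens in loader:
            log_probs = model(images).log_softmax(2)  # (T, N, C) for CTC
            loss = ctc_loss(log_probs, targets, input_lens, target_lens)
            optimizer.zero_grad()
            loss.backward()
            optimizer.step()
```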
no code implementations • 15 Jun 2021 • Christian Reul, Christoph Wick, Maximilian Nöth, Andreas Büttner, Maximilian Wehner, Uwe Springmann
Training a more specialized model for some unseen Early Modern Latin books starting from our mixed model led to a CER of 1.47%, an improvement of up to 50% compared to training from scratch and up to 30% compared to training from the aforementioned standard model.
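The character error rate (CER) quoted here is the edit distance between recognized text and ground truth, divided by the length of the ground truth. A self-contained way to compute it:

```python
def cer(reference: str, hypothesis: str) -> float:
    """Character error rate: Levenshtein distance / reference length."""
    m, n = len(reference), len(hypothesis)
    prev = list(range(n + 1))
    for i in range(1, m + 1):
        cur = [i] + [0] * n
        for j in range(1, n + 1):
            cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            cur[j] = min(prev[j] + 1,         # deletion
                         cur[j - 1] + 1,      # insertion
                         prev[j - 1] + cost)  # substitution
        prev = cur
    return prev[n] / max(m, 1)

# One substituted character in a 21-character line: CER of about 4.8%.
print(cer("Die alte Druckschrift", "Die alte Druckschrlft"))
```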
no code implementations • 9 Sep 2019 • Christian Reul, Dennis Christ, Alexander Hartelt, Nico Balbach, Maximilian Wehner, Uwe Springmann, Christoph Wick, Christine Grundig, Andreas Büttner, Frank Puppe
Nevertheless, in the last few years great progress has been made in the area of historical OCR, resulting in several powerful open-source tools for preprocessing, layout recognition and segmentation, character recognition and post-processing.
Optical Character Recognition (OCR)
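The four stages named in the abstract chain naturally into a pipeline. A minimal sketch with hypothetical stage signatures; the actual open-source tools each expose their own interfaces:

```python
from typing import Callable, List

Image = bytes                             # stand-in for pixel data
Preprocess = Callable[[Image], Image]     # e.g. binarization, deskewing
Segment = Callable[[Image], List[Image]]  # layout analysis -> line images
Recognize = Callable[[Image], str]        # character recognition per line
Postprocess = Callable[[str], str]        # e.g. lexical post-correction

def run_pipeline(page: Image, pre: Preprocess, seg: Segment,
                 rec: Recognize, post: Postprocess) -> List[str]:
    """Chain the four stages of a typical historical-OCR workflow."""
    return [post(rec(line)) for line in seg(pre(page))]
```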
1 code implementation • 8 Oct 2018 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe
In this paper we evaluate Optical Character Recognition (OCR) of 19th century Fraktur scripts without book-specific training using mixed models, i.e. models trained to recognize a variety of fonts and typesets from previously unseen sources.
Optical Character Recognition (OCR)
no code implementations • 14 Sep 2018 • Uwe Springmann, Christian Reul, Stefanie Dipper, Johannes Baiter
In this paper we describe a dataset of German and Latin ground truth (GT) for historical OCR in the form of printed text line images paired with their transcription.
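A common on-disk convention for such line-level ground truth is a transcription file stored next to each line image. The loader below assumes the `.png` / `.gt.txt` pairing used by several OCR tools; the dataset's actual layout may differ:

```python
from pathlib import Path

def load_ground_truth(root: str):
    """Pair each line image with its transcription file.

    Assumes the common convention of a `.gt.txt` transcription next
    to each `.png` line image.
    """
    pairs = []
    for gt_file in Path(root).rglob("*.gt.txt"):
        img_file = gt_file.with_name(gt_file.name.replace(".gt.txt", ".png"))
        if img_file.exists():
            text = gt_file.read_text(encoding="utf-8").strip()
            pairs.append((img_file, text))
    return pairs

pairs = load_ground_truth("GT4HistOCR")
print(f"{len(pairs)} transcribed line images")
```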
1 code implementation • 27 Feb 2018 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe
We combine three methods which significantly improve the accuracy of OCR models trained on early printed books: (1) The pretraining method utilizes the information stored in already existing models trained on a variety of typesets (mixed models) instead of starting the training from scratch.
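The core of the pretraining idea is to keep the body of an existing mixed model and rebuild only its output layer when the new book's alphabet (codec) differs. A hedged PyTorch sketch, where `model.fc` and the codec lists are assumptions rather than the authors' implementation:

```python
import torch
import torch.nn as nn

def adapt_output_layer(model: nn.Module, old_codec: list, new_codec: list):
    """Keep output weights for characters shared by both alphabets;
    rows for characters the mixed model has never seen start fresh."""
    old_fc: nn.Linear = model.fc  # assumed final projection layer
    new_fc = nn.Linear(old_fc.in_features, len(new_codec))
    with torch.no_grad():
        for i, ch in enumerate(new_codec):
            if ch in old_codec:
                j = old_codec.index(ch)
                new_fc.weight[i] = old_fc.weight[j]
                new_fc.bias[i] = old_fc.bias[j]
    model.fc = new_fc
    return model
```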
1 code implementation • 15 Dec 2017 • Christian Reul, Christoph Wick, Uwe Springmann, Frank Puppe
The evaluation on seven early printed books showed that training from the Latin mixed model reduces the average number of errors by 43% and 26% compared to training from scratch with 60 and 150 lines of ground truth, respectively.
1 code implementation • 27 Nov 2017 • Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe
Experiments on seven early printed books show that the proposed method considerably outperforms the standard approach, reducing the number of errors by 50% and more.
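The method trains several models on cross folds of the ground truth and combines their outputs by confidence voting. The sketch below is a deliberately simplified line-level stand-in; the actual approach aligns the outputs and votes on the character level:

```python
from collections import Counter

def vote_line(predictions: list[tuple[str, float]]) -> str:
    """Combine the outputs of several cross-fold models for one line:
    each model votes with its text, weighted by its own confidence."""
    scores = Counter()
    for text, confidence in predictions:
        scores[text] += confidence
    return scores.most_common(1)[0][0]

# Three fold-models disagree; the two confident ones outvote the third.
print(vote_line([("Gottes", 0.95), ("Gottes", 0.90), ("Got tes", 0.60)]))
```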
2 code implementations • 20 Jan 2017 • Christian Reul, Uwe Springmann, Frank Puppe
A semi-automatic open-source tool for layout analysis on early printed books is presented.
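At its core, such layout analysis starts from connected components of the binarized page. A very reduced sketch of that first step, using SciPy; the tool itself adds region classification and interactive correction on top:

```python
import numpy as np
from scipy import ndimage

def find_text_regions(binarized: np.ndarray, min_area: int = 500):
    """Bounding boxes of sufficiently large connected components.

    `binarized` holds 1 for ink pixels and 0 for background; the
    area threshold filters out specks and marginal noise.
    """
    labeled, _ = ndimage.label(binarized)
    regions = []
    for slc in ndimage.find_objects(labeled):
        h = slc[0].stop - slc[0].start
        w = slc[1].stop - slc[1].start
        if h * w >= min_area:
            regions.append((slc[0].start, slc[1].start, h, w))
    return regions
```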
no code implementations • 19 Jan 2017 • Florian Fink, Klaus-U. Schulz, Uwe Springmann
Here we improve this method in three respects. First, the method of Reffle (2013) is not adaptive: user feedback obtained in actual postcorrection steps cannot be used to compute refined profiles.
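Making the profiles adaptive means folding user corrections back into the profiler's statistics. A toy illustration that counts character confusions from corrected tokens; the real profiler estimates weighted edit operations, not raw counts:

```python
from collections import Counter

def update_profile(profile: Counter, ocr_token: str, corrected: str):
    """Fold a user's correction back into the error profile, so the
    estimate of typical OCR confusions adapts to the document.
    Toy model: count substitutions at aligned positions only."""
    for o, c in zip(ocr_token, corrected):
        if o != c:
            profile[(o, c)] += 1
    return profile

profile = Counter()
update_profile(profile, "fhip", "ship")  # long s misread as 'f'
update_profile(profile, "fun", "sun")
print(profile.most_common())             # [(('f', 's'), 2)]
```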