Search Results for author: Frank Puppe

Found 15 papers, 7 papers with code

The FairyNet Corpus - Character Networks for German Fairy Tales

no code implementations EMNLP (LaTeCHCLfL, CLFL, LaTeCH) 2021 David Schmidt, Albin Zehe, Janne Lorenzen, Lisa Sergel, Sebastian Düker, Markus Krug, Frank Puppe

The release of this corpus provides an opportunity to train and compare different algorithms for the extraction of character networks, which so far was barely possible due to the heterogeneous interests of previous researchers.

Detecting Scenes in Fiction: A new Segmentation Task

no code implementations EACL 2021 Albin Zehe, Leonard Konle, Lea Katharina Dümpelmann, Evelyn Gius, Andreas Hotho, Fotis Jannidis, Lucas Kaufmann, Markus Krug, Frank Puppe, Nils Reiter, Annekea Schreiber, Nathalie Wiedmer

This paper introduces the novel task of scene segmentation on narrative texts and provides an annotated corpus, a discussion of the linguistic and narrative properties of the task and baseline experiments towards automatic solutions.

coreference-resolution Scene Segmentation +1

OCR4all -- An Open-Source Tool Providing a (Semi-)Automatic OCR Workflow for Historical Printings

no code implementations 9 Sep 2019 Christian Reul, Dennis Christ, Alexander Hartelt, Nico Balbach, Maximilian Wehner, Uwe Springmann, Christoph Wick, Christine Grundig, Andreas Büttner, Frank Puppe

Nevertheless, in the last few years great progress has been made in the area of historical OCR, resulting in several powerful open-source tools for preprocessing, layout recognition and segmentation, character recognition and post-processing.

Optical Character Recognition Optical Character Recognition (OCR)

State of the Art Optical Character Recognition of 19th Century Fraktur Scripts using Open Source Engines

1 code implementation 8 Oct 2018 Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

In this paper we evaluate Optical Character Recognition (OCR) of 19th century Fraktur scripts without book-specific training using mixed models, i.e., models trained to recognize a variety of fonts and typesets from previously unseen sources.

Optical Character Recognition Optical Character Recognition (OCR)

Improving OCR Accuracy on Early Printed Books by combining Pretraining, Voting, and Active Learning

1 code implementation 27 Feb 2018 Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

We combine three methods which significantly improve the accuracy of OCR models trained on early printed books: (1) The pretraining method utilizes the information stored in already existing models trained on a variety of typesets (mixed models) instead of starting the training from scratch.

Active Learning

Improving OCR Accuracy on Early Printed Books using Deep Convolutional Networks

1 code implementation 27 Feb 2018 Christoph Wick, Christian Reul, Frank Puppe

This paper proposes a combination of a convolutional and an LSTM network to improve the accuracy of OCR on early printed books.

Optical Character Recognition (OCR)

Transfer Learning for OCRopus Model Training on Early Printed Books

1 code implementation 15 Dec 2017 Christian Reul, Christoph Wick, Uwe Springmann, Frank Puppe

The evaluation on seven early printed books showed that training from the Latin mixed model reduces the average number of errors by 43% and 26% compared to training from scratch with 60 and 150 lines of ground truth, respectively.

Optical Character Recognition (OCR) Transfer Learning

Leaf Identification Using a Deep Convolutional Neural Network

no code implementations 4 Dec 2017 Christoph Wick, Frank Puppe

Convolutional neural networks (CNNs) have become popular in the last few years, especially in computer vision, because they have achieved outstanding performance on different tasks, such as image classification.

Data Augmentation General Classification +1

Improving OCR Accuracy on Early Printed Books by utilizing Cross Fold Training and Voting

1 code implementation 27 Nov 2017 Christian Reul, Uwe Springmann, Christoph Wick, Frank Puppe

Experiments on seven early printed books show that the proposed method considerably outperforms the standard approach, reducing the number of errors by up to 50% and more.

Optical Character Recognition (OCR)
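
The idea named in this paper's title, cross fold training followed by voting, can be illustrated with a small sketch. It is a simplified stand-in and not the authors' implementation: the helper names `make_folds` and `majority_vote` are hypothetical, the "model outputs" are hard-coded strings, and the vote is a plain per-character majority over outputs assumed to be already aligned and of equal length.

```python
from collections import Counter

def make_folds(lines, k):
    """Split ground-truth lines into k disjoint folds (round-robin).

    In a cross fold setup, model i would be trained on all folds
    except fold i, which is held out for validation.
    """
    return [lines[i::k] for i in range(k)]

def majority_vote(outputs):
    """Combine several recognition results for the same text line.

    Assumption: all outputs are already aligned and equally long,
    so the vote can be taken position by position.
    """
    if len({len(o) for o in outputs}) != 1:
        raise ValueError("outputs must have equal length for this sketch")
    return "".join(
        Counter(chars).most_common(1)[0][0] for chars in zip(*outputs)
    )

# Ten hypothetical ground-truth line ids split into five folds.
folds = make_folds(list(range(10)), 5)

# Three hypothetical model outputs for the same line; one contains an error.
voted = majority_vote(["Buch", "Bueh", "Buch"])
print(folds)
print(voted)
```

Here a single model's misreading is outvoted by the other two. Real OCR outputs differ in length, so a practical voter would first need to align them.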

Fully Convolutional Neural Networks for Page Segmentation of Historical Document Images

no code implementations 21 Nov 2017 Christoph Wick, Frank Puppe

To evaluate this model, we introduce Foreground Pixel Accuracy (FgPA), a novel metric that is independent of ambiguous ground truth.

Segmentation
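
The abstract excerpt for this paper names the FgPA metric without defining it. The sketch below shows one plausible reading, assuming FgPA is ordinary pixel accuracy restricted to foreground (ink) pixels, so that label disagreements on background pixels do not count. The function name, the toy arrays, and the foreground mask are all hypothetical and are not taken from the paper.

```python
def foreground_pixel_accuracy(pred, truth, foreground):
    """Share of foreground pixels whose predicted label matches the truth.

    pred, truth: 2D lists of class labels with the same shape.
    foreground:  2D list of booleans, True where the pixel is ink.
    Assumption: background pixels are ignored entirely.
    """
    correct = 0
    total = 0
    for pred_row, truth_row, fg_row in zip(pred, truth, foreground):
        for p, t, is_fg in zip(pred_row, truth_row, fg_row):
            if is_fg:
                total += 1
                correct += int(p == t)
    if total == 0:
        raise ValueError("the mask contains no foreground pixels")
    return correct / total

# Toy 2x3 page: labels 0, 1, 2 are arbitrary region classes.
foreground = [[True, False, True],
              [False, True, True]]
truth = [[1, 0, 1],
         [0, 2, 2]]
pred = [[1, 2, 1],
        [1, 2, 1]]

fgpa = foreground_pixel_accuracy(pred, truth, foreground)
print(fgpa)  # 0.75: three of the four foreground pixels match
```

In this toy example both background pixels are mislabeled, which would pull plain pixel accuracy down to 0.5, while the foreground-only score is unaffected by them.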
