no code implementations • LREC 2022 • Martin Volk, Lukas Fischer, Patricia Scheurer, Bernard Silvan Schroffenegger, Raphael Schwitter, Phillip Ströbel, Benjamin Suter
This paper is based on a collection of 16th century letters from and to the Zurich reformer Heinrich Bullinger.
1 code implementation • ECNLP (ACL) 2022 • Tannon Kew, Martin Volk
In this work we examine the task of generating more specific responses for online reviews in the hospitality domain by identifying generic responses in the training data, filtering them and fine-tuning the generation model.
no code implementations • READI (LREC) 2022 • Renate Hauser, Jannis Vamvas, Sarah Ebling, Martin Volk
Simplified language news articles are being offered by specialized web portals in several countries.
no code implementations • LT4HALA (LREC) 2022 • Lukas Fischer, Patricia Scheurer, Raphael Schwitter, Martin Volk
This paper outlines our work in collecting training data for and developing a Latin–German Neural Machine Translation (NMT) system, for translating 16th century letters.
1 code implementation • 21 Mar 2022 • Phillip Benjamin Ströbel, Simon Clematide, Martin Volk, Tobias Hodel
We apply the TrOCR framework to real-world, historical manuscripts and show that TrOCR per se is a strong model, ideal for transfer learning.
1 code implementation • LREC 2022 • Phillip Benjamin Ströbel, Simon Clematide, Martin Volk, Raphael Schwitter, Tobias Hodel, David Schoch
The evaluation of Handwritten Text Recognition (HTR) models during their development is straightforward: because HTR is a supervised problem, the usual data split into training, validation, and test data sets allows the evaluation of models in terms of accuracy or error rates.
no code implementations • LREC 2020 • Andreas Säuberli, Sarah Ebling, Martin Volk
Automatic text simplification is an active research area, and there are first systems for English, Spanish, Portuguese, and Italian.
no code implementations • LREC 2020 • Phillip Benjamin Ströbel, Simon Clematide, Martin Volk
Recent advances in Optical Character Recognition (OCR) and Handwritten Text Recognition (HTR) have led to more accurate text recognition of historical documents.
no code implementations • RANLP 2019 • Tannon Kew, Anastassia Shaitarova, Isabel Meraner, Janis Goldzycher, Simon Clematide, Martin Volk
Geotagging historic and cultural texts provides valuable access to heritage data, enabling location-based searching and new geographically related discoveries.
no code implementations • WS 2019 • Samuel Läubli, Chantal Amrhein, Patrick Düggelin, Beatriz Gonzalez, Alena Zwahlen, Martin Volk
Neural machine translation (NMT) has set new quality standards in automatic translation, yet its effect on post-editing productivity is still pending thorough investigation.
1 code implementation • EMNLP 2018 • Samuel Läubli, Rico Sennrich, Martin Volk
Recent research suggests that neural machine translation achieves parity with professional human translation on the WMT Chinese–English news translation task.
no code implementations • LREC 2016 • Simon Clematide, Lenz Furrer, Martin Volk
Crowdsourcing approaches for post-correction of OCR output (Optical Character Recognition) have been successfully applied to several historic text collections.
no code implementations • 23 May 2014 • Kurt Winkler, Tobias Kuhn, Martin Volk
After being operational for two winter seasons, we assess here the quality of the produced texts based on an evaluation where participants rate real danger descriptions from both origins, the catalogue of phrases versus the manually written and translated texts.
no code implementations • LREC 2014 • Martin Volk, Johannes Graën, Elena Callegaro
Recent years have seen an increased interest in and availability of parallel corpora.
no code implementations • LREC 2014 • Thierry Etchegoyhen, Lindsay Bywood, Mark Fishel, Panayota Georgakopoulou, Jie Jiang, Gerard van Loenhout, Arantza del Pozo, Mirjam Sepesy Maučec, Anja Turner, Martin Volk
This article describes a large-scale evaluation of the use of Statistical Machine Translation for professional subtitling.
no code implementations • LREC 2012 • Volha Petukhova, Rodrigo Agerri, Mark Fishel, Sergio Penkale, Arantza del Pozo, Mirjam Sepesy Maučec, Andy Way, Panayota Georgakopoulou, Martin Volk
Subtitling and audiovisual translation have been recognized as areas that could greatly benefit from the introduction of Statistical Machine Translation (SMT) followed by post-editing, in order to increase the efficiency of the subtitle production process.