1 code implementation • WS 2019 • Martin Riedl, Daniela Betz, Sebastian Padó
This article focuses on the problem of identifying articles and recovering their text from within and across newspaper pages when OCR delivers only one text file per page.
no code implementations • CL 2018 • Martin Riedl, Chris Biemann
First, we introduce DRUID, a method for detecting multiword expressions (MWEs).
no code implementations • ACL 2018 • Martin Riedl, Sebastian Padó
We ask how to practically build a model for German named entity recognition (NER) that performs at the state of the art for both contemporary and historical texts, i.e., a big-data and a small-data scenario.
no code implementations • NAACL 2018 • Ahmed Elsafty, Martin Riedl, Chris Biemann
Detecting the similarity between job advertisements is important for job recommendation systems as it allows, for example, the application of item-to-item based recommendations.
no code implementations • IJCNLP 2017 • Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
Complex word identification (CWI) is an important task in text accessibility.
no code implementations • RANLP 2017 • Seid Muhie Yimam, Sanja Štajner, Martin Riedl, Chris Biemann
Complex Word Identification (CWI) is an important task in lexical simplification and text accessibility.
no code implementations • ACL 2014 • Sunny Mitra, Ritwik Mitra, Martin Riedl, Chris Biemann, Animesh Mukherjee, Pawan Goyal
In this paper, we propose an unsupervised method to identify noun sense changes based on rigorous analysis of time-varying text data available in the form of millions of digitized books.
no code implementations • LREC 2014 • Martin Riedl, Richard Steuer, Chris Biemann
This paper introduces a distributional thesaurus and sense clusters computed on the complete Google Syntactic N-grams, which is extracted from Google Books, a very large corpus of digitized books published between 1520 and 2008.