no code implementations • LREC 2014 • Piotr Bański, Nils Diewald, Michael Hanl, Marc Kupietz, Andreas Witt
We present an approach to an aspect of managing complex access scenarios to large and heterogeneous corpora that involves handling user queries that, intentionally or due to the complexity of the queried resource, target texts or annotations outside of the given user's permissions.
no code implementations • LREC 2016 • Nils Diewald, Michael Hanl, Eliza Margaretha, Joachim Bingel, Marc Kupietz, Piotr Bański, Andreas Witt
KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS).
no code implementations • LREC 2020 • Marc Kupietz, Nils Diewald, Eliza Margaretha
Making corpora accessible and usable for linguistic research is a huge challenge in view of (too) big data, legal issues and a rapidly evolving methodology.
no code implementations • CMLC (LREC) 2022 • Nils Diewald
This paper presents an algorithm and implementation for efficient tokenization of space-delimited languages based on a deterministic finite state automaton.