no code implementations • LREC 2012 • Piotr Bański, Peter M. Fischer, Elena Frick, Erik Ketzan, Marc Kupietz, Carsten Schnober, Oliver Schonefeld, Andreas Witt
The aim of this project is to develop an innovative corpus analysis platform to tackle the increasing demands of modern linguistic research.
no code implementations • LREC 2014 • Piotr Bański, Nils Diewald, Michael Hanl, Marc Kupietz, Andreas Witt
We present an approach to an aspect of managing complex access scenarios to large and heterogeneous corpora that involves handling user queries that, intentionally or due to the complexity of the queried resource, target texts or annotations outside of the given user's permissions.
no code implementations • LREC 2014 • Marc Kupietz, Harald Lüngen
This paper gives an overview of recent developments in the German Reference Corpus DeReKo in terms of growth, maximising relevant corpus strata, metadata, legal issues, and its current and future research interface.
no code implementations • LREC 2016 • Nils Diewald, Michael Hanl, Eliza Margaretha, Joachim Bingel, Marc Kupietz, Piotr Bański, Andreas Witt
KorAP is a corpus search and analysis platform, developed at the Institute for the German Language (IDS).
no code implementations • LREC 2020 • Marc Kupietz, Nils Diewald, Eliza Margaretha
Making corpora accessible and usable for linguistic research is a huge challenge in view of (too) big data, legal issues and a rapidly evolving methodology.
no code implementations • LREC 2020 • Denis Arnold, Bernhard Fisseni, Pawel Kamocki, Oliver Schonefeld, Marc Kupietz, Thomas Schmidt
This paper addresses long-term archival for large corpora.
no code implementations • LREC 2020 • Peter Fankhauser, Bich-Ngoc Do, Marc Kupietz
We evaluate a graph-based dependency parser on DeReKo, a large corpus of contemporary German.
no code implementations • ACL (MWE) 2021 • Miriam Amin, Peter Fankhauser, Marc Kupietz, Roman Schneider
The automatic recognition of idioms poses a challenging problem for NLP applications.
no code implementations • CMLC (LREC) 2022 • Peter Fankhauser, Marc Kupietz
We present the use of count-based and predictive language models for exploring language use in the German Reference Corpus DeReKo.