no code implementations • LREC 2022 • Elisabeth Eder, Michael Wiegand, Ulrike Krieg-Holz, Udo Hahn
The exploding amount of user-generated content has spurred NLP research to deal with documents from various digital communication formats (tweets, chats, emails, etc.).
1 code implementation • EACL 2021 • Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn
To track different levels of formality in written discourse, we introduce a novel type of lexicon for the German language, with entries ordered by their degree of (in)formality.
no code implementations • LREC 2020 • Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn
The vast amount of social communication distributed over various electronic media channels (tweets, blogs, emails, etc.
no code implementations • RANLP 2019 • Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn
We deal with the pseudonymization of those stretches of text in emails that might allow real individuals to be identified.
no code implementations • WS 2019 • Elisabeth Eder, Ulrike Krieg-Holz, Udo Hahn
In this paper, we describe a workflow for the data-driven acquisition and semantic scaling of a lexicon that covers lexical items from the lower end of the German language register: terms typically considered as rough, vulgar or obscene.
no code implementations • LREC 2016 • Ulrike Krieg-Holz, Christian Schuschnig, Franz Matthies, Benjamin Redling, Udo Hahn
We introduce CODE ALLTAG, a text corpus composed of German-language e-mails.