no code implementations • LREC 2020 • Thomas Proisl, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Andreas Blombach, Stefan Evert
The EmpiriST corpus (Beißwenger et al., 2016) is a manually tokenized and part-of-speech tagged corpus of approximately 23,000 tokens of German Web and CMC (computer-mediated communication) data.
no code implementations • LREC 2020 • Andreas Blombach, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Thomas Proisl
GeRedE is a 270 million token German CMC corpus containing approximately 380,000 submissions and 6,800,000 comments posted on Reddit between 2010 and 2018.
1 code implementation • WS 2018 • Thomas Proisl, Philipp Heinrich, Besim Kabashi, Stefan Evert
EmotiKLUE is a submission to the Implicit Emotion Shared Task.
no code implementations • LREC 2016 • Besim Kabashi, Thomas Proisl
Part-of-speech tagging is a basic step in Natural Language Processing that is often essential.
no code implementations • LREC 2012 • Thomas Proisl, Peter Uhrig
State-of-the-art dependency representations such as the Stanford Typed Dependencies may represent the grammatical relations in a sentence as directed, possibly cyclic graphs.