no code implementations • gwll (LREC) 2022 • Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică
Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC.
no code implementations • gwll (LREC) 2022 • Jorge Gracia, Besim Kabashi, Ilan Kernerman
The objective of the Translation Inference Across Dictionaries (TIAD) series of shared tasks is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual/multilingual lexicographic resources.
no code implementations • COLING 2022 • Christian Chiarcos, Elena-Simona Apostol, Besim Kabashi, Ciprian-Octavian Truică
OntoLex-Lemon has become a de facto standard for lexical resources in the web of data.
no code implementations • LREC 2020 • Thomas Proisl, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Andreas Blombach, Stefan Evert
The EmpiriST corpus (Beißwenger et al., 2016) is a manually tokenized and part-of-speech tagged corpus of approximately 23,000 tokens of German Web and CMC (computer-mediated communication) data.
no code implementations • LREC 2020 • Andreas Blombach, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Thomas Proisl
GeRedE is a 270-million-token German CMC corpus containing approximately 380,000 submissions and 6,800,000 comments posted on Reddit between 2010 and 2018.
1 code implementation • WS 2018 • Thomas Proisl, Philipp Heinrich, Besim Kabashi, Stefan Evert
EmotiKLUE is a submission to the Implicit Emotion Shared Task.
no code implementations • LREC 2016 • Besim Kabashi, Thomas Proisl
Part-of-speech tagging is a basic step in Natural Language Processing that is often essential.