Search Results for author: Besim Kabashi

Found 13 papers, 1 paper with code

Modelling Collocations in OntoLex-FrAC

no code implementations gwll (LREC) 2022 Christian Chiarcos, Katerina Gkirtzou, Maxim Ionov, Besim Kabashi, Fahad Khan, Ciprian-Octavian Truică

Following presentations of frequency and attestations, and embeddings and distributional similarity, this paper introduces the third cornerstone of the emerging OntoLex module for Frequency, Attestation and Corpus-based Information, OntoLex-FrAC.

TIAD 2022: The Fifth Translation Inference Across Dictionaries Shared Task

no code implementations gwll (LREC) 2022 Jorge Gracia, Besim Kabashi, Ilan Kernerman

The objective of the Translation Inference Across Dictionaries (TIAD) series of shared tasks is to explore and compare methods and techniques that infer translations indirectly between language pairs, based on other bilingual/multilingual lexicographic resources.

Translation

EmpiriST Corpus 2.0: Adding Manual Normalization, Lemmatization and Semantic Tagging to a German Web and CMC Corpus

no code implementations LREC 2020 Thomas Proisl, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Andreas Blombach, Stefan Evert

The EmpiriST corpus (Beißwenger et al., 2016) is a manually tokenized and part-of-speech tagged corpus of approximately 23,000 tokens of German Web and CMC (computer-mediated communication) data.

Lemmatization

A Corpus of German Reddit Exchanges (GeRedE)

no code implementations LREC 2020 Andreas Blombach, Natalie Dykes, Philipp Heinrich, Besim Kabashi, Thomas Proisl

GeRedE is a 270 million token German CMC corpus containing approximately 380,000 submissions and 6,800,000 comments posted on Reddit between 2010 and 2018.
