no code implementations • COLING (PEOPLES) 2020 • Nikola Ljubešić, Ilia Markov, Darja Fišer, Walter Daelemans
We further showcase the usage of the lexicons by calculating the difference in emotion distributions in texts containing and not containing socially unacceptable discourse, comparing them across four languages (English, Croatian, Dutch, Slovene) and two topics (migrants and LGBT).
no code implementations • EACL (WASSA) 2021 • Ilia Markov, Nikola Ljubešić, Darja Fišer, Walter Daelemans
In this paper, we describe experiments designed to evaluate the impact of stylometric and emotion-based features on hate speech detection: the task of classifying textual content into hate or non-hate speech classes.
no code implementations • ParlaCLARIN (LREC) 2022 • Maciej Ogrodniczuk, Petya Osenova, Tomaž Erjavec, Darja Fišer, Nikola Ljubešić, Çağrı Çöltekin, Matyáš Kopp, Katja Meden
In ParlaMint I, a CLARIN-ERIC supported project in pandemic times, a set of comparable and uniformly annotated multilingual corpora for 17 national parliaments was developed and released in 2021.
no code implementations • ParlaCLARIN (LREC) 2022 • Jure Skubic, Darja Fišer
One of the major sociological research interests has always been the study of political discourse.
no code implementations • 5 Jun 2019 • Nikola Ljubešić, Darja Fišer, Tomaž Erjavec
In this paper we present datasets of Facebook comment threads to mainstream media posts in Slovene and English, developed inside the Slovene national project FRENK, which cover two topics, migrants and LGBT, and are manually annotated for different types of socially unacceptable discourse (SUD).
no code implementations • 5 Jun 2019 • Nikola Ljubešić, Darja Fišer, Tomaž Erjavec
This paper presents a dataset and supervised learning experiments for term extraction from Slovene academic texts.
1 code implementation • 9 Jul 2018 • Nikola Ljubešić, Darja Fišer, Anita Peti-Stantić
We show that the notions of concreteness and imageability are highly predictable both within and across languages, with a moderate loss of up to 20% in correlation when predicting across languages.