Search Results for author: Mladen Karan

Found 17 papers, 1 paper with code

Mitigating Topic Bias when Detecting Decisions in Dialogue

no code implementations • SIGDIAL (ACL) 2021 • Mladen Karan, Prashant Khare, Patrick Healey, Matthew Purver

This work revisits the task of detecting decision-related utterances in multi-party dialogue.

CoRAL: a Context-aware Croatian Abusive Language Dataset

no code implementations • 11 Nov 2022 • Ravi Shekhar, Mladen Karan, Matthew Purver

In light of unprecedented increases in the popularity of the internet and social media, comment moderation has never been a more relevant task.

Abusive Language

Not All Comments are Equal: Insights into Comment Moderation from a Topic-Aware Model

1 code implementation • RANLP 2021 • Elaine Zosa, Ravi Shekhar, Mladen Karan, Matthew Purver

Moderation of reader comments is a significant problem for online news platforms.

XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages

no code implementations • COLING 2020 • Goran Glavaš, Mladen Karan, Ivan Vulić

We present XHate-999, a multi-domain and multilingual evaluation data set for abusive language detection.

Abusive Language • Disentanglement • +2

Classification-Based Self-Learning for Weakly Supervised Bilingual Lexicon Induction

no code implementations • ACL 2020 • Mladen Karan, Ivan Vulić, Anna Korhonen, Goran Glavaš

Effective projection-based cross-lingual word embedding (CLWE) induction critically relies on the iterative self-learning procedure.

Bilingual Lexicon Induction • Classification • +3

PANDORA Talks: Personality and Demographics on Reddit

no code implementations • NAACL (SocialNLP) 2021 • Matej Gjurković, Mladen Karan, Iva Vukojević, Mihaela Bošnjak, Jan Šnajder

Personality and demographics are important variables in social sciences, while in NLP they can aid in interpretability and removal of societal biases.

Gender Classification

Preemptive Toxic Language Detection in Wikipedia Comments Using Thread-Level Context

no code implementations • WS 2019 • Mladen Karan, Jan Šnajder

We address the task of automatically detecting toxic content in user generated texts.

Data Set for Stance and Sentiment Analysis from User Comments on Croatian News

no code implementations • WS 2019 • Mihaela Bošnjak, Mladen Karan

Nowadays it is becoming more important than ever to find new ways of extracting useful information from the ever-growing amount of user-generated data available online.

BIG-bench Machine Learning • Sentiment Analysis

Cross-Domain Detection of Abusive Language Online

no code implementations • WS 2018 • Mladen Karan, Jan Šnajder

We investigate to what extent the models trained to detect general abusive language generalize between different datasets labeled with different abusive language types.

Abusive Language • Domain Adaptation • +1

Combining Shallow and Deep Learning for Aggressive Text Detection

no code implementations • COLING 2018 • Viktor Golem, Mladen Karan, Jan Šnajder

The task, however, is far from being trivial, as what is considered as aggressive speech can be quite subjective, and the task is further complicated by the noisy nature of user-generated text on social networks.

BIG-bench Machine Learning • Text Detection

TakeLab-QA at SemEval-2017 Task 3: Classification Experiments for Answer Retrieval in Community QA

no code implementations • SEMEVAL 2017 • Filip Šaina, Toni Kukurin, Lukrecija Puljić, Mladen Karan, Jan Šnajder

We use features based on different semantic similarity models (e.g., Latent Dirichlet Allocation), as well as features based on several types of pre-trained word embeddings.

Community Question Answering • General Classification • +6