no code implementations • NAACL (NLP4IF) 2021 • Tulika Bose, Irina Illina, Dominique Fohr
Rapidly changing social media content calls for robust and generalisable abuse detection models.
no code implementations • NAACL (SocialNLP) 2021 • Tulika Bose, Irina Illina, Dominique Fohr
State-of-the-art abusive language detection models report strong in-corpus performance, but underperform when evaluated on abusive comments that differ from the training scenario.
no code implementations • 17 Oct 2022 • Tulika Bose, Irina Illina, Dominique Fohr
The concerning rise of hateful content on online platforms has increased attention to automatic hate speech detection, commonly formulated as a supervised classification task.
no code implementations • COLING 2022 • Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
State-of-the-art approaches for hate-speech detection usually exhibit poor performance in out-of-domain settings.
1 code implementation • Findings (ACL) 2022 • Tulika Bose, Nikolaos Aletras, Irina Illina, Dominique Fohr
In this paper, we propose to automatically identify and reduce spurious correlations using attribution methods with dynamic refinement of the list of terms that need to be regularized during training.
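The core idea described above can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes a toy bag-of-words logistic regression, gradient-times-input attributions, and an L2 penalty on the attributions of the currently flagged terms, with the flagged list re-selected (dynamically refined) every epoch. All names and hyperparameters here are illustrative assumptions.

```python
import numpy as np

# Hedged toy sketch (NOT the paper's code): logistic regression over
# bag-of-words features. Each epoch we (a) compute gradient-x-input
# attributions per term, (b) dynamically refine the list of terms with
# the largest mean |attribution|, and (c) add a penalty on those terms'
# attributions, discouraging over-reliance on individual (possibly
# spurious) terms.

rng = np.random.default_rng(0)
vocab = ["termA", "termB", "identity_term", "people", "think", "love"]
# toy data: the label depends only on the first two terms; all names
# and the data-generating process are made up for illustration
X = rng.integers(0, 2, size=(200, len(vocab))).astype(float)
y = ((X[:, 0] + X[:, 1]) > 0).astype(float)

w = np.zeros(len(vocab))
lr, lam, top_k = 0.5, 0.1, 2  # assumed hyperparameters

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

for epoch in range(50):
    p = sigmoid(X @ w)
    # gradient-x-input attribution of term j on example i: w_j * x_ij
    attr = X * w
    # dynamic refinement: re-select the top-k terms by mean |attribution|
    flagged = np.argsort(-np.abs(attr).mean(axis=0))[:top_k]
    # cross-entropy gradient
    grad = X.T @ (p - y) / len(y)
    # gradient of lam * mean_i attr_ij^2 w.r.t. w_j for flagged terms
    grad[flagged] += lam * 2.0 * w[flagged] * (X[:, flagged] ** 2).sum(axis=0) / len(y)
    w -= lr * grad
```

In the full method this would operate on a neural classifier with a proper attribution method; the sketch only shows the training loop's shape: attribute, refine the term list, regularize.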