Abuse Detection
30 papers with code • 0 benchmarks • 4 datasets
Abuse detection is the task of identifying abusive behaviors, such as hate speech, offensive language, sexism and racism, in utterances from social media platforms (Source: https://arxiv.org/abs/1802.00385).
Latest papers
Breaking the Silence: Detecting and Mitigating Gendered Abuse in Hindi, Tamil, and Indian English Online Spaces
Online gender-based harassment is a widespread issue limiting the free expression and participation of women and marginalized genders in digital spaces.
TCAB: A Large-Scale Text Classification Attack Benchmark
In addition to the primary tasks of detecting and labeling attacks, TCAB can also be used for attack localization, attack target labeling, and attack characterization.
Explainable Abuse Detection as Intent Classification and Slot Filling
To proactively offer social media users a safe online experience, there is a need for systems that can detect harmful posts and promptly alert platform moderators.
Improving Generalizability in Implicitly Abusive Language Detection with Concept Activation Vectors
Robustness of machine learning models on ever-changing real-world data is critical, especially for applications affecting human well-being such as content moderation.
Entropy-based Attention Regularization Frees Unintended Bias Mitigation from Lists
EAR also reveals overfitting terms, i.e., terms most likely to induce bias, to help identify their effect on the model, task, and predictions.
ADIMA: Abuse Detection In Multilingual Audio
Abusive content detection in spoken text can be addressed by performing Automatic Speech Recognition (ASR) and leveraging advancements in natural language processing.
ConvAbuse: Data, Analysis, and Benchmarks for Nuanced Abuse Detection in Conversational AI
We find that the distribution of abuse is vastly different compared to other commonly used datasets, with more sexually tinted aggression towards the virtual persona of these systems.
AAA: Fair Evaluation for Abuse Detection Systems Wanted
In this work, we introduce Adversarial Attacks against Abuse (AAA), a new evaluation strategy and associated metric that better captures a model’s performance on certain classes of hard-to-classify microposts and that, for example, penalises systems which are biased on low-level lexical features.
AbuseAnalyzer: Abuse Detection, Severity and Target Prediction for Gab Posts
While the extensive popularity of online social media platforms has made information dissemination faster, it has also resulted in widespread online abuse of different types, such as hate speech, offensive language, and sexist and racist opinions.
KUISAIL at SemEval-2020 Task 12: BERT-CNN for Offensive Speech Identification in Social Media
In this paper, we describe our approach to utilizing pre-trained BERT models with Convolutional Neural Networks for sub-task A of the Multilingual Offensive Language Identification shared task (OffensEval 2020), which is part of SemEval 2020.