Hate Speech Detection
164 papers with code • 14 benchmarks • 39 datasets
Hate speech detection is the task of determining whether a communication, such as text or audio, expresses hatred or encourages violence towards a person or a group of people. Such hatred is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models can be evaluated with metrics such as the F-score (also called the F-measure).
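As a rough illustration of the F-score mentioned above, here is a minimal Python sketch that computes precision, recall, and F-beta for a binary hateful / not-hateful labelling. The function name `f_beta`, the label encoding (1 = hateful, 0 = not hateful), and the toy labels are assumptions made for this example only; they are not taken from ETHOS, HateXplain, or any particular benchmark's evaluation script.

```python
def f_beta(y_true, y_pred, positive=1, beta=1.0):
    """Return (precision, recall, F-beta) for the `positive` class.

    Hypothetical helper for illustration; beta=1.0 gives the usual F1 score.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)

    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0

    b2 = beta * beta
    score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, score


# Toy labels, invented for illustration: 1 = hateful, 0 = not hateful.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0, 1, 0]

precision, recall, f1 = f_beta(y_true, y_pred)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

Hate speech datasets are often imbalanced, so published results commonly report a per-class or macro-averaged F-score rather than accuracy; the sketch above covers only the single-class case.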
Libraries
Use these libraries to find Hate Speech Detection models and implementations
Datasets
Subtasks
Latest papers with no code
Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement
Content moderation faces a challenging task as social media's ability to spread hate speech contrasts with its role in promoting global connectivity.
NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps
The use of words to convey a speaker's intent is traditionally distinguished from the 'mention' of words for quoting what someone said, or pointing out properties of a word.
Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors
In this paper, we propose the Nested Product of Experts (NPoE) defense framework, which involves a mixture of experts (MoE) as a trigger-only ensemble within the PoE defense framework to simultaneously defend against multiple trigger types.
Securing Social Spaces: Harnessing Deep Learning to Eradicate Cyberbullying
In today's digital world, cyberbullying is a serious problem that can harm the mental and physical health of people who use social media.
A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs
The surge of interest in data augmentation within the realm of NLP has been driven by the need to address challenges posed by hate speech domains, the dynamic nature of social media vocabulary, and the demands for large-scale neural networks requiring extensive training data.
Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales
Although social media platforms are a prominent arena for users to engage in interpersonal discussions and express opinions, the facade and anonymity offered by social media may allow users to spew hate speech and offensive content.
Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models
This paper presents a comprehensive examination of the impact of tokenization strategies and vocabulary sizes on the performance of Arabic language models in downstream natural language processing tasks.
Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection
Large language models (LLMs) excel in many diverse applications beyond language generation, e.g., translation, summarization, and sentiment analysis.
Subjective Isms? On the Danger of Conflating Hate and Offence in Abusive Language Detection
Natural language processing research has begun to embrace the notion of annotator subjectivity, motivated by variations in labelling.
Leveraging Weakly Annotated Data for Hate Speech Detection in Code-Mixed Hinglish: A Feasibility-Driven Transfer Learning Approach with Large Language Models
Zero-shot learning, one-shot learning, and few-shot learning and prompting approaches have then been applied to assign labels to the comments and compare them to human-assigned labels.