Hate Speech Detection

164 papers with code • 14 benchmarks • 39 datasets

Hate speech detection is the task of determining whether a communication, such as text or audio, contains hatred or encourages violence towards a person or a group of people. This is usually based on prejudice against 'protected characteristics' such as ethnicity, gender, sexual orientation, religion, or age. Example benchmarks include ETHOS and HateXplain. Models can be evaluated with metrics such as the F-score (F-measure).
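The F-score mentioned above balances precision and recall on the positive (hateful) class. A minimal sketch of the computation, using illustrative toy labels (the label names "hate" and "normal" are assumptions, not from any particular benchmark):

```python
# Minimal sketch of the F1 metric commonly used to evaluate hate speech
# detectors. The label values below are illustrative, not from any dataset.

def f1_score(gold, pred, positive="hate"):
    """Return (precision, recall, F1) for the positive class."""
    tp = sum(1 for g, p in zip(gold, pred) if g == positive and p == positive)
    fp = sum(1 for g, p in zip(gold, pred) if g != positive and p == positive)
    fn = sum(1 for g, p in zip(gold, pred) if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    # Harmonic mean of precision and recall; 0.0 when both are zero.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1

gold = ["hate", "hate", "normal", "normal", "hate"]
pred = ["hate", "normal", "normal", "hate", "hate"]
p, r, f = f1_score(gold, pred)  # 2 true positives, 1 false positive, 1 false negative
```

In practice, macro-averaged F1 (averaging per-class F1 over both classes) is often reported instead, since hate speech datasets are typically class-imbalanced.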

Latest papers with no code

Cross-Platform Hate Speech Detection with Weakly Supervised Causal Disentanglement

no code yet • 17 Apr 2024

Content moderation faces a challenging task as social media's ability to spread hate speech contrasts with its role in promoting global connectivity.

NLP Systems That Can't Tell Use from Mention Censor Counterspeech, but Teaching the Distinction Helps

no code yet • 2 Apr 2024

The use of words to convey a speaker's intent is traditionally distinguished from the 'mention' of words for quoting what someone said or pointing out properties of a word.

Two Heads are Better than One: Nested PoE for Robust Defense Against Multi-Backdoors

no code yet • 2 Apr 2024

In this paper, we propose the Nested Product of Experts (NPoE) defense framework, which involves a mixture of experts (MoE) as a trigger-only ensemble within the PoE defense framework to simultaneously defend against multiple trigger types.

Securing Social Spaces: Harnessing Deep Learning to Eradicate Cyberbullying

no code yet • 1 Apr 2024

In today's digital world, cyberbullying is a serious problem that can harm the mental and physical health of people who use social media.

A Comprehensive Study on NLP Data Augmentation for Hate Speech Detection: Legacy Methods, BERT, and LLMs

no code yet • 30 Mar 2024

The surge of interest in data augmentation within the realm of NLP has been driven by the need to address challenges posed by hate speech domains, the dynamic nature of social media vocabulary, and the demands for large-scale neural networks requiring extensive training data.

Towards Interpretable Hate Speech Detection using Large Language Model-extracted Rationales

no code yet • 19 Mar 2024

Although social media platforms are a prominent arena for users to engage in interpersonal discussions and express opinions, the facade and anonymity offered by social media may allow users to spew hate speech and offensive content.

Exploring Tokenization Strategies and Vocabulary Sizes for Enhanced Arabic Language Models

no code yet • 17 Mar 2024

This paper presents a comprehensive examination of the impact of tokenization strategies and vocabulary sizes on the performance of Arabic language models in downstream natural language processing tasks.

Harnessing Artificial Intelligence to Combat Online Hate: Exploring the Challenges and Opportunities of Large Language Models in Hate Speech Detection

no code yet • 12 Mar 2024

Large language models (LLMs) excel in many diverse applications beyond language generation, e.g., translation, summarization, and sentiment analysis.

Subjective Isms? On the Danger of Conflating Hate and Offence in Abusive Language Detection

no code yet • 4 Mar 2024

Natural language processing research has begun to embrace the notion of annotator subjectivity, motivated by variations in labelling.

Leveraging Weakly Annotated Data for Hate Speech Detection in Code-Mixed Hinglish: A Feasibility-Driven Transfer Learning Approach with Large Language Models

no code yet • 4 Mar 2024

Zero-shot, one-shot, and few-shot learning and prompting approaches were then applied to assign labels to the comments, which were compared with human-assigned labels.