Hate Speech Detection
164 papers with code • 14 benchmarks • 39 datasets
Hate speech detection is the task of detecting if communication such as text, audio, and so on contains hatred and or encourages violence towards a person or a group of people. This is usually based on prejudice against 'protected characteristics' such as their ethnicity, gender, sexual orientation, religion, age et al. Some example benchmarks are ETHOS and HateXplain. Models can be evaluated with metrics like the F-score or F-measure.
Libraries
Use these libraries to find Hate Speech Detection models and implementationsDatasets
Subtasks
Latest papers
Exploring the Trade-off Between Model Performance and Explanation Plausibility of Text Classifiers Using Human Rationales
By leveraging a multi-objective optimization algorithm, we explore the trade-off between the two loss functions and generate a Pareto-optimal frontier of models that balance performance and plausibility.
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data
To address the global issue of hateful content proliferating in online platforms, hate speech detection (HSD) models are typically developed on datasets collected in the United States, thereby failing to generalize to English dialects from the Majority World.
Improving Adversarial Data Collection by Supporting Annotators: Lessons from GAHD, a German Hate Speech Dataset
Our experiments show that the resulting dataset is challenging even for state-of-the-art hate speech detection models, and that training on GAHD clearly improves model robustness.
CLIcK: A Benchmark Dataset of Cultural and Linguistic Intelligence in Korean
Despite the rapid development of large language models (LLMs) for the Korean language, there remains an obvious lack of benchmark datasets that test the requisite Korean cultural and linguistic knowledge.
GPT-HateCheck: Can LLMs Write Better Functional Tests for Hate Speech Detection?
A recent proposal in this direction is HateCheck, a suite for testing fine-grained model functionalities on synthesized data generated using templates of the kind "You are just a [slur] to me."
Bryndza at ClimateActivism 2024: Stance, Target and Hate Event Detection via Retrieval-Augmented GPT-4 and LLaMA
This study details our approach for the CASE 2024 Shared Task on Climate Activism Stance and Hate Event Detection, focusing on Hate Speech Detection, Hate Speech Target Identification, and Stance Detection as classification challenges.
Probing Critical Learning Dynamics of PLMs for Hate Speech Detection
Despite the widespread adoption, there is a lack of research into how various critical aspects of pretrained language models (PLMs) affect their performance in hate speech detection.
Attentive Fusion: A Transformer-based Approach to Multimodal Hate Speech Detection
With the recent surge and exponential growth of social media usage, scrutinizing social media content for the presence of any hateful content is of utmost importance.
MetaHate: A Dataset for Unifying Efforts on Hate Speech Detection
Hate speech represents a pervasive and detrimental form of online discourse, often manifested through an array of slurs, from hateful tweets to defamatory posts.
TuPy-E: detecting hate speech in Brazilian Portuguese social media with a novel dataset and comprehensive analysis of models
Social media has become integral to human interaction, providing a platform for communication and expression.