48 papers with code • 7 benchmarks • 11 datasets
Hate Speech Detection is the task of automatically detecting whether a piece of text contains hate speech.
We train a multi-class classifier to distinguish between these different categories.
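As a rough illustration of what training a multi-class text classifier can look like, here is a minimal sketch of a bag-of-words multinomial Naive Bayes model in plain Python. It is not the method of the paper quoted above: the snippet does not name its categories or model, so the labels `class_a`, `class_b` and `class_c`, the toy sentences, and the names `NaiveBayesClassifier` and `tokenize` are all assumptions made for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-alpha (Laplace) smoothing."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha
        self.class_counts = Counter()            # documents seen per label
        self.word_counts = defaultdict(Counter)  # token counts per label
        self.vocab = set()

    def fit(self, texts, labels):
        for text, label in zip(texts, labels):
            self.class_counts[label] += 1
            for token in tokenize(text):
                self.word_counts[label][token] += 1
                self.vocab.add(token)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocab)
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocab:
                    continue  # ignore tokens never seen during training
                count = self.word_counts[label][token]
                score += math.log(
                    (count + self.alpha)
                    / (total_words + self.alpha * vocab_size)
                )
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy data: three placeholder categories with neutral text.
train_texts = [
    "red apple sweet apple",
    "green pear sweet",
    "fast car loud engine",
    "slow train loud",
    "cold rain grey cloud",
    "warm sun clear sky",
]
train_labels = [
    "class_a", "class_a",
    "class_b", "class_b",
    "class_c", "class_c",
]

clf = NaiveBayesClassifier().fit(train_texts, train_labels)
prediction = clf.predict("sweet apple")
```

A real system would train on an annotated corpus and would more likely use a stronger model, but the structure is the same: fit on labelled text, then pick the highest-scoring category for each new input.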
Additionally, when BERT is trained with bias labels for hate speech detection, its prediction score increases, suggesting that bias and hate are intertwined.
Hate speech is commonly defined as any communication that disparages a target group of people based on a characteristic such as race, colour, ethnicity, gender, sexual orientation, nationality, or religion.
Hate speech detection is a challenging problem, and most of the available datasets cover only one language: English.
We also observe that models that use human rationales for training perform better at reducing unintended bias towards target communities.
Ranked #1 on Hate Speech Detection on HateXplain
Current research on hate speech analysis is typically oriented towards monolingual, single-classification tasks.