TuPyE-Dataset (Portuguese Hate Speech Expanded Dataset)

Introduced by Oliveira et al. in TuPy-E: detecting hate speech in Brazilian Portuguese social media with a novel dataset and comprehensive analysis of models

TuPyE, an expanded version of TuPy, comprises 43,668 annotated documents selected for hate speech detection across a range of social network contexts. It adds new annotations and merges the datasets of Fortuna et al. (2019), Leite et al. (2020), and Vargas et al. (2022) with 10,000 original documents from the TuPy-Dataset.

Because annotated data is far scarcer in Portuguese than in English, TuPyE aims to expand and improve existing datasets. The larger corpus is meant to support the development of hate speech detection models built with machine learning (ML) and natural language processing (NLP) techniques.

Benchmarks

No benchmarks yet.

Dataset Loaders

Silly-Machine/TuPyE-Dataset
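
The loader ID above can be used roughly as follows. This is a minimal sketch, assuming the ID refers to a Hugging Face Hub repository readable with the `datasets` library; the helper names `load_tupye` and `label_distribution` are hypothetical, and the configuration, split, and column names are not stated on this page, so they are left as parameters.

```python
from collections import Counter

# Loader ID as listed on this page (assumed to be a Hugging Face Hub repository).
REPO_ID = "Silly-Machine/TuPyE-Dataset"


def load_tupye(config_name=None, split=None):
    """Download TuPyE from the Hub.

    Requires `pip install datasets` and network access. `config_name` and
    `split` are assumptions to be filled in from the dataset card.
    """
    # Imported lazily so that label_distribution() works without `datasets`.
    from datasets import load_dataset

    return load_dataset(REPO_ID, name=config_name, split=split)


def label_distribution(rows, label_key):
    """Count how often each value of `label_key` occurs in dict-like rows."""
    return Counter(row[label_key] for row in rows)
```

Once the actual split and label column names are known from the dataset card, passing the loaded split and the column name to `label_distribution` gives a quick check of class balance before training. If the repository defines several configurations, `config_name` must be set to one of them.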

License

cc-by-4.0

Modalities

Texts

Languages

Portuguese
