DeToxy (DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances)

Introduced by Ghosh et al. in DeToxy: A Large-Scale Multimodal Dataset for Toxicity Classification in Spoken Utterances

DeToxy is a publicly available, toxicity-annotated English-language dataset. It is sourced from various openly available speech databases and consists of over 2 million utterances. The dataset is intended to serve as a benchmark for the relatively new and unexplored Spoken Language Processing task of detecting toxicity in spoken utterances, and to encourage further research in this area.




License


  • Unknown
