A large-scale and machine-generated dataset of 274,186 toxic and benign statements about 13 minority groups.

This dataset uses a demonstration-based prompting framework and an adversarial classifier-in-the-loop decoding method to generate subtly toxic and benign text with a massive pre-trained language model (GPT-3). Controlling machine generation in this way allows TOXIGEN to cover implicitly toxic text at a larger scale, and about more demographic groups, than previous resources of human-written text. TOXIGEN can be used to fight human-written and machine-generated toxicity.

Papers


Paper Code Results Date Stars

Dataset Loaders


Tasks


Similar Datasets


License


  • Unknown

Modalities


Languages