Jigsaw Toxic Comment Classification Dataset

You are provided with a large number of Wikipedia comments which have been labeled by human raters for toxic behavior. The types of toxicity are:

toxic
severe_toxic
obscene
threat
insult
identity_hate

You must create a model which predicts a probability of each type of toxicity for each comment.
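As a rough illustration of what "a probability of each type of toxicity for each comment" looks like in practice, the sketch below writes one row of six independent probabilities per comment. The `id` column name, the `write_predictions` helper, and the example values are assumptions made for illustration; sample_submission.csv is the reference for the actual format. The six values are treated as separate per-label probabilities, so they are not required to sum to 1.

```python
import csv
import io

# The six label names, in the order listed above.
LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def write_predictions(rows, out):
    """Write (comment_id, {label: probability}) pairs as CSV.

    The `id` column name is an assumption; check sample_submission.csv.
    """
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id"] + LABELS)
    for comment_id, probs in rows:
        # Each label gets its own probability in [0, 1].
        for label in LABELS:
            if not 0.0 <= probs[label] <= 1.0:
                raise ValueError(f"{label} probability out of range for {comment_id}")
        writer.writerow([comment_id] + [f"{probs[label]:.4f}" for label in LABELS])


# Hypothetical prediction for one comment; the id and values are made up.
example = [
    ("abc123", {"toxic": 0.91, "severe_toxic": 0.12, "obscene": 0.78,
                "threat": 0.01, "insult": 0.64, "identity_hate": 0.03}),
]
buffer = io.StringIO()
write_predictions(example, buffer)
submission_text = buffer.getvalue()
```

Writing to an in-memory buffer keeps the sketch self-contained; passing an open file handle instead would produce a file on disk.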

File descriptions

train.csv - the training set; contains comments with their binary labels
test.csv - the test set; you must predict the toxicity probabilities for these comments. To deter hand labeling, the test set contains some comments which are not included in scoring.
sample_submission.csv - a sample submission file in the correct format
test_labels.csv - labels for the test data; a value of -1 indicates the comment was not used for scoring (note: file added after the competition closed)

Usage

The dataset is released under CC0, with the underlying comment text governed by Wikipedia's CC-SA-3.0 license.
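The -1 convention in test_labels.csv means that unscored comments should be dropped before any evaluation. A minimal sketch of that filtering step follows, using only the standard library. The `id` column name, the `scored_rows` helper, and the in-memory sample are assumptions for illustration; the sketch also keeps a row only if none of its labels is -1, which may be stricter than needed if the real file marks unscored rows differently.

```python
import csv
import io

LABELS = ["toxic", "severe_toxic", "obscene", "threat", "insult", "identity_hate"]


def scored_rows(label_file):
    """Yield (id, labels) for rows that were used for scoring.

    Assumes an `id` column plus one column per label.
    """
    for row in csv.DictReader(label_file):
        # A value of -1 marks a comment that was excluded from scoring.
        if all(row[label] != "-1" for label in LABELS):
            yield row["id"], {label: int(row[label]) for label in LABELS}


# Tiny in-memory stand-in for test_labels.csv; ids and values are invented.
sample = io.StringIO(
    "id,toxic,severe_toxic,obscene,threat,insult,identity_hate\n"
    "a1,-1,-1,-1,-1,-1,-1\n"
    "b2,1,0,1,0,1,0\n"
    "c3,0,0,0,0,0,0\n"
)
kept = list(scored_rows(sample))
```

With the real file, `open("test_labels.csv", newline="")` would take the place of the in-memory sample.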

Benchmarks

Task	Dataset Variant	Best Model
Toxic Comment Classification	Jigsaw Toxic Comment Classification Dataset	CapsNet

Tasks

Toxic Comment Classification
