The CIFAR-10 dataset (Canadian Institute for Advanced Research, 10 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The images are labelled with one of 10 mutually exclusive classes: airplane, automobile (but not truck or pickup truck), bird, cat, deer, dog, frog, horse, ship, and truck (but not pickup truck). There are 6000 images per class with 5000 training and 1000 testing images per class.
10,443 PAPERS • 66 BENCHMARKS
WSJ0-2mix is a speech recognition corpus of speech mixtures using utterances from the Wall Street Journal (WSJ0) corpus.
118 PAPERS • 2 BENCHMARKS
ImageNet-P consists of noise, blur, weather, and digital distortions. The dataset has validation perturbations; has difficulty levels; has CIFAR-10, Tiny ImageNet, ImageNet 64 × 64, standard, and Inception-sized editions; and has been designed for benchmarking not training networks. ImageNet-P departs from ImageNet-C by having perturbation sequences generated from each ImageNet validation image. Each sequence contains more than 30 frames, so to counteract an increase in dataset size and evaluation time only 10 common perturbations are used.
21 PAPERS • 1 BENCHMARK
NAS-Bench-1Shot1 draws on the recent large-scale tabular benchmark NAS-Bench-101 for cheap anytime evaluations of one-shot NAS methods.
20 PAPERS • NO BENCHMARKS YET
comma 2k19 is a dataset of over 33 hours of commute in California's 280 highway. This means 2019 segments, 1 minute long each, on a 20km section of highway driving between California's San Jose and San Francisco. The dataset was collected using comma EONs that have sensors similar to those of any modern smartphone including a road-facing camera, phone GPS, thermometers and a 9-axis IMU.
9 PAPERS • NO BENCHMARKS YET
The PointDenoisingBenchmark dataset features 28 different shapes, split into 18 training shapes and 10 test shapes.
1 PAPER • NO BENCHMARKS YET
REAP is a digital benchmark that allows the user to evaluate patch attacks on real images, and under real-world conditions. Built on top of the Mapillary Vistas dataset, the benchmark contains over 14,000 traffic signs. Each sign is augmented with a pair of geometric and lighting transformations, which can be used to apply a digitally generated patch realistically onto the sign.
Text Classification Attack Benchmark (TCAB) is a dataset for analyzing, understanding, detecting, and labeling adversarial attacks against text classifiers. TCAB includes 1.5 million attack instances, generated by twelve adversarial attack targeting three classifiers trained on six source datasets for sentiment analysis and abuse detection in English. The process of generating attacks is automated, so that TCAB can easily be extended to incorporate new text attacks and better classifiers as they are developed.