The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. There are 600 images per class. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). There are 500 training images and 100 testing images per class.
7,672 PAPERS • 52 BENCHMARKS
The Caltech 101 Silhouettes dataset consists of 4,100 training samples, 2,264 validation samples and 2,307 test samples. The datast is based on CalTech 101 image annotations. Each image in the CalTech 101 data set includes a high-quality polygon outline of the primary object in the scene. To create the CalTech 101 Silhouettes data set, the authors center and scale each outline and render it on a DxD pixel image-plane. The outline is rendered as a filled, black polygon on a white background. Many object classes exhibit silhouettes that have distinctive class-specific features. A relatively small number of classes like soccer ball, pizza, stop sign, and yin-yang are indistinguishable based on shape, but have been left-in in the data.
27 PAPERS • NO BENCHMARKS YET
WI-LOCNESS is part of the Building Educational Applications 2019 Shared Task for Grammatical Error Correction. It consists of two datasets:
26 PAPERS • 2 BENCHMARKS
WikiAtomicEdits is a corpus of 43 million atomic edits across 8 languages. These edits are mined from Wikipedia edit history and consist of instances in which a human editor has inserted a single contiguous phrase into, or deleted a single contiguous phrase from, an existing sentence.
9 PAPERS • NO BENCHMARKS YET
HASY is a dataset of single symbols similar to MNIST. It contains 168,233 instances of 369 classes. HASY contains two challenges: A classification challenge with 10 pre-defined folds for 10-fold cross-validation and a verification challenge.
3 PAPERS • NO BENCHMARKS YET