Fashion-MNIST is a dataset comprising of 28×28 grayscale images of 70,000 fashion products from 10 categories, with 7,000 images per category. The training set has 60,000 images and the test set has 10,000 images. Fashion-MNIST shares the same image size, data format and the structure of training and testing splits with the original MNIST.
2,796 PAPERS • 17 BENCHMARKS
Office-Home is a benchmark dataset for domain adaptation which contains 4 domains where each domain consists of 65 categories. The four domains are: Art – artistic images in the form of sketches, paintings, ornamentation, etc.; Clipart – collection of clipart images; Product – images of objects without a background and Real-World – images of objects captured with a regular camera. It contains 15,500 images, with an average of around 70 images per class and a maximum of 99 images in a class.
943 PAPERS • 11 BENCHMARKS
DomainNet is a dataset of common objects in six different domain. All domains include 345 categories (classes) of objects such as Bracelet, plane, bird and cello. The domains include clipart: collection of clipart images; real: photos and real world images; sketch: sketches of specific objects; infograph: infographic images with specific object; painting artistic depictions of objects in the form of paintings and quickdraw: drawings of the worldwide players of game “Quick Draw!”.
612 PAPERS • 10 BENCHMARKS
PACS is an image dataset for domain generalization. It consists of four domains, namely Photo (1,670 images), Art Painting (2,048 images), Cartoon (2,344 images) and Sketch (3,929 images). Each domain contains seven categories.
568 PAPERS • 7 BENCHMARKS
ImageNet-C is an open source data set that consists of algorithmically generated corruptions (blur, noise) applied to the ImageNet test-set.
513 PAPERS • 3 BENCHMARKS
ImageNet-R(endition) contains art, cartoons, deviantart, graffiti, embroidery, graphics, origami, paintings, patterns, plastic objects, plush objects, sculptures, sketches, tattoos, toys, and video game renditions of ImageNet classes.
346 PAPERS • 6 BENCHMARKS
The ImageNet-A dataset consists of real-world, unmodified, and naturally occurring examples that are misclassified by ResNet models.
321 PAPERS • 5 BENCHMARKS
ImageNet-Sketch data set consists of 50,889 images, approximately 50 images for each of the 1000 ImageNet classes. The data set is constructed with Google Image queries "sketch of ", where is the standard class name. Only within the "black and white" color scheme is searched. 100 images are initially queried for every class, and the pulled images are cleaned by deleting the irrelevant images and images that are for similar but different classes. For some classes, there are less than 50 images after manually cleaning, and then the data set is augmented by flipping and rotating the images.
205 PAPERS • 3 BENCHMARKS
The Stylized-ImageNet dataset is created by removing local texture cues in ImageNet while retaining global shape information on natural images via AdaIN style transfer. This nudges CNNs towards learning more about shapes and less about local textures.
102 PAPERS • 1 BENCHMARK
Our goal is to improve upon the status quo for designing image classification models trained in one domain that perform well on images from another domain. Complementing existing work in robustness testing, we introduce the first test dataset for this purpose which comes from an authentic use case where photographers wanted to learn about the content in their images. We built a new test set using 8,900 images taken by people who are blind for which we collected metadata to indicate the presence versus absence of 200 ImageNet object categories. We call this dataset VizWiz-Classification.
21 PAPERS • 3 BENCHMARKS
The MSK dataset is a dataset for lesion recognition from the Memorial Sloan-Kettering Cancer Center. It is used as part of the ISIC lesion recognition challenges.
12 PAPERS • NO BENCHMARKS YET
Wild-Time is a benchmark of 5 datasets that reflect temporal distribution shifts arising in a variety of real-world applications, including patient prognosis and news classification. On these datasets, we systematically benchmark 13 prior approaches, including methods in domain generalization, continual learning, self-supervised learning, and ensemble learning.
4 PAPERS • NO BENCHMARKS YET
We design an all-day semantic segmentation benchmark all-day CityScapes. It is the first semantic segmentation benchmark that contains samples from all-day scenarios, i.e., from dawn to night. Our dataset will be made publicly available at [https://isis-data.science.uva.nl/cv/1ADcityscape.zip].
3 PAPERS • 1 BENCHMARK