9 dataset results for Small Data Image Classification

CIFAR-10 (Canadian Institute for Advanced Research, 10 classes)

The CIFAR-10 dataset (Canadian Institute for Advanced Research, 10 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The images are labelled with one of 10 mutually exclusive classes: airplane, automobile (but not truck or pickup truck), bird, cat, deer, dog, frog, horse, ship, and truck (but not pickup truck). There are 6000 images per class with 5000 training and 1000 testing images per class.

14,132 PAPERS • 98 BENCHMARKS

CIFAR-100

The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. There are 600 images per class. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). There are 500 training images and 100 testing images per class.

7,683 PAPERS • 52 BENCHMARKS

CUB-200-2011 (Caltech-UCSD Birds-200-2011)

The Caltech-UCSD Birds-200-2011 (CUB-200-2011) dataset is the most widely-used dataset for fine-grained visual categorization task. It contains 11,788 images of 200 subcategories belonging to birds, 5,994 for training and 5,794 for testing. Each image has detailed annotations: 1 subcategory label, 15 part locations, 312 binary attributes and 1 bounding box. The textual information comes from Reed et al.. They expand the CUB-200-2011 dataset by collecting fine-grained natural language descriptions. Ten single-sentence descriptions are collected for each image. The natural language descriptions are collected through the Amazon Mechanical Turk (AMT) platform, and are required at least 10 words, without any information of subcategories and actions.

1,963 PAPERS • 44 BENCHMARKS

UCF-Crime

The UCF-Crime dataset is a large-scale dataset of 128 hours of videos. It consists of 1900 long and untrimmed real-world surveillance videos, with 13 realistic anomalies including Abuse, Arrest, Arson, Assault, Road Accident, Burglary, Explosion, Fighting, Robbery, Shooting, Stealing, Shoplifting, and Vandalism. These anomalies are selected because they have a significant impact on public safety.

108 PAPERS • 1 BENCHMARK

TMED (Tufts Medical Echocardiogram Dataset)

TMED is a clinically-motivated benchmark dataset for computer vision and machine learning from limited labeled data.

8 PAPERS • NO BENCHMARKS YET

Ecoli

The Ecoli dataset is a dataset for protein localization. It contains 336 E.coli proteins split into 8 different classes.

7 PAPERS • NO BENCHMARKS YET

DEIC Benchmark (Data-Efficient Image Classification Benchmark)

DEIC is a benchmark for measuring the data efficiency of models in the context of image classification. It is composed of 6 datasets that contain a small number of training samples per class (i.e., 30 < x < 80). It covers multiple image domains (i.e., natural images, fine-grained recognition, medical images, remote sensing, handwriting recognition) and data types (i.e., RGB, grayscale, multi-spectral).

3 PAPERS • 5 BENCHMARKS

ImageNet 50 samples per class

This ImageNet version contains only 50 training images per class while the original testing set remains unchanged. It is one of the datasets comprising the data-efficient image classification (DEIC) benchmark. It was proposed to challenge the generalization capabilities of modern image classifiers.

1 PAPER • 1 BENCHMARK

WikiChurches

WikiChurches is a dataset for architectural style classification, consisting of 9,485 images of church buildings. Both images and style labels were sourced from Wikipedia. The dataset can serve as a benchmark for various research fields, as it combines numerous real-world challenges: fine-grained distinctions between classes based on subtle visual features, a comparatively small sample size, a highly imbalanced class distribution, a high variance of viewpoints, and a hierarchical organization of labels, where only some images are labeled at the most precise level.

1 PAPER • NO BENCHMARKS YET

Datasets

9 dataset results for Small Data Image Classification