The CIFAR-100 dataset (Canadian Institute for Advanced Research, 100 classes) is a subset of the Tiny Images dataset and consists of 60000 32x32 color images. The 100 classes in the CIFAR-100 are grouped into 20 superclasses. There are 600 images per class. Each image comes with a "fine" label (the class to which it belongs) and a "coarse" label (the superclass to which it belongs). There are 500 training images and 100 testing images per class.
6,926 PAPERS • 50 BENCHMARKS
WebKB is a dataset that includes web pages from computer science departments of various universities. 4,518 web pages are categorized into 6 imbalanced categories (Student, Faculty, Staff, Department, Course, Project). Additionally there is Other miscellanea category that is not comparable to the rest.
80 PAPERS • 6 BENCHMARKS
The Blackbird unmanned aerial vehicle (UAV) dataset is a large-scale, aggressive indoor flight dataset collected using a custom-built quadrotor platform for use in evaluation of agile perception. The Blackbird dataset contains over 10 hours of flight data from 168 flights over 17 flight trajectories and 5 environments. Each flight includes sensor data from 120Hz stereo and downward-facing photorealistic virtual cameras, 100Hz IMU, motor speed sensors, and 360Hz millimeter-accurate motion capture ground truth. Camera images for each flight were photorealistically rendered using FlightGoggles across a variety of environments to facilitate easy experimentation of high performance perception algorithms.
8 PAPERS • NO BENCHMARKS YET
The UCLA Aerial Event Dataest has been captured by a low-cost hex-rotor with a GoPro camera, which is able to eliminate the high frequency vibration of the camera and hold in air autonomously through a GPS and a barometer. It can also fly 20 ∼ 90m above the ground and stays 5 minutes in air.
4 PAPERS • NO BENCHMARKS YET
The dataset represents data generated from a commonly used model in population genetics. It comprises a matrix of 1,000,000 rows and 9 columns, representing parameters and summaries generated by an infinite-sites coalescent model for genetic variation. The first two columns encode the scaled mutation rate (theta) and scaled recombination rate (rho). The subsequent seven columns are data summaries: number of segregating sites (C1), standard uniform random noise acting as a distractor (C2), pairwise mean number of nucleotidic differences (C3), mean $R^2$ across pairs separated by <10% of the simulated genomic regions (C4), number of distinct haplotypes (C5), frequency of the most common haplotype (C6), number of singleton haplotypes (C7).
2 PAPERS • NO BENCHMARKS YET
The ExBAN dataset: a corpus of NL explanations generated by crowd-sourced participants presented with the task of explaining simple Bayesian Network (BN) graphical representations. These explanations, in a separate collection effort, are rated for clarity and informativeness.
1 PAPER • NO BENCHMARKS YET
When arriving at each state, each observation token gets a coin toss to see whether it will appear in the output observation string. Numbers on the left are indices of observations, numbers on the right are indices of states.
1 PAPER • NO BENCHMARKS YET