The SUN RGBD dataset contains 10335 real RGB-D images of room scenes. Each RGB image has a corresponding depth and segmentation map. As many as 700 object categories are labeled. The training and testing sets contain 5285 and 5050 images, respectively.
208 PAPERS • 8 BENCHMARKS
ZINC is a free database of commercially-available compounds for virtual screening. ZINC contains over 230 million purchasable compounds in ready-to-dock, 3D formats. ZINC also contains over 750 million purchasable compounds that can be searched for analogs.
80 PAPERS • 4 BENCHMARKS
R2R is a dataset for visually-grounded natural language navigation in real buildings. The dataset requires autonomous agents to follow human-generated navigation instructions in previously unseen buildings, as illustrated in the demo above. For training, each instruction is associated with a Matterport3D Simulator trajectory. 22k instructions are available, with an average length of 29 words. There is a test evaluation server for this dataset available at EvalAI.
57 PAPERS • 2 BENCHMARKS
The NCLT dataset is a large scale, long-term autonomy dataset for robotics research collected on the University of Michigan’s North Campus. The dataset consists of omnidirectional imagery, 3D lidar, planar lidar, GPS, and proprioceptive sensors for odometry collected using a Segway robot. The dataset was collected to facilitate research focusing on long-term autonomous operation in changing environments. The dataset is comprised of 27 sessions spaced approximately biweekly over the course of 15 months. The sessions repeatedly explore the campus, both indoors and outdoors, on varying trajectories, and at different times of the day across all four seasons. This allows the dataset to capture many challenging elements including: moving obstacles (e.g., pedestrians, bicyclists, and cars), changing lighting, varying viewpoint, seasonal and weather changes (e.g., falling leaves and snow), and long-term structural changes caused by construction projects.
41 PAPERS • NO BENCHMARKS YET
The MAESTRO dataset contains over 200 hours of paired audio and MIDI recordings from ten years of International Piano-e-Competition. The MIDI data includes key strike velocities and sustain/sostenuto/una corda pedal positions. Audio and MIDI files are aligned with ∼3 ms accuracy and sliced to individual musical pieces, which are annotated with composer, title, and year of performance. Uncompressed audio is of CD quality or higher (44.1–48 kHz 16-bit PCM stereo).
32 PAPERS • NO BENCHMARKS YET
Jericho is a learning environment for man-made Interactive Fiction (IF) games.
15 PAPERS • NO BENCHMARKS YET
The Kumar dataset contains 30 1,000×1,000 image tiles from seven organs (6 breast, 6 liver, 6 kidney, 6 prostate, 2 bladder, 2 colon and 2 stomach) of The Cancer Genome Atlas (TCGA) database acquired at 40× magnification. Within each image, the boundary of each nucleus is fully annotated.
11 PAPERS • 1 BENCHMARK
The NYU Hand pose dataset contains 8252 test-set and 72757 training-set frames of captured RGBD data with ground-truth hand-pose information. For each frame, the RGBD data from 3 Kinects is provided: a frontal view and 2 side views. The training set contains samples from a single user only (Jonathan Tompson), while the test set contains samples from two users (Murphy Stein and Jonathan Tompson). A synthetic re-creation (rendering) of the hand pose is also provided for each view.
10 PAPERS • 1 BENCHMARK
The Watch-n-Patch dataset was created with the focus on modeling human activities, comprising multiple actions in a completely unsupervised setting. It is collected with Microsoft Kinect One sensor for a total length of about 230 minutes, divided in 458 videos. 7 subjects perform human daily activities in 8 offices and 5 kitchens with complex backgrounds. Moreover, skeleton data are provided as ground truth annotations.
9 PAPERS • NO BENCHMARKS YET
DCASE2018 Task 4 is a dataset for large-scale weakly labeled semi-supervised sound event detection in domestic environments. The data are YouTube video excerpts focusing on domestic context which could be used for example in ambient assisted living applications. The domain was chosen due to the scientific challenges (wide variety of sounds, time-localized events...) and potential industrial applications. Specifically, the task employs a subset of “Audioset: An Ontology And Human-Labeled Dataset For Audio Events” by Google. Audioset consists of an expanding ontology of 632 sound event classes and a collection of 2 million human-labeled 10-second sound clips (less than 21% are shorter than 10-seconds) drawn from 2 million Youtube videos. The ontology is specified as a hierarchical graph of event categories, covering a wide range of human and animal sounds, musical instruments and genres, and common everyday environmental sounds. Task 4 focuses on a subset of Audioset that consists of 10
8 PAPERS • NO BENCHMARKS YET
Synbols is a dataset generator designed for probing the behavior of learning algorithms. By defining the distribution over latent factors one can craft a dataset specifically tailored to answer specific questions about a given algorithm.
5 PAPERS • NO BENCHMARKS YET
The Daimler Monocular Pedestrian Detection dataset is a dataset for pedestrian detection in urban environments. The training set contains 15560 pedestrian samples (image cut-outs at 48×96 resolution) and 6744 additional full images without pedestrians for extracting negative samples. The test set contains an independent sequence with more than 21790 images and 56492 pedestrian labels (fully visible or partially occluded), captured from a vehicle during a 27 min driving through the urban traffic.
1 PAPER • NO BENCHMARKS YET
The datasets includes curves drawn on 3D surfaces (triangle meshes) in Virtual Reality. A total of 2,880 curves were created using two different techniques by 20 users on 6 meshes. For each curve, a 3D curve executed by the user is provided, the projected curve created on the mesh, and the ground truth target curve on the mesh. For collecting the data, two different task types were employed, which are described in the paper.
1 PAPER • NO BENCHMARKS YET