ScanNet is an instance-level indoor RGB-D dataset that includes both 2D and 3D data. Annotations are provided as labeled voxels rather than points or objects. ScanNet v2, the latest release, comprises 1513 annotated scans with approximately 90% surface coverage. For the semantic segmentation task, the data are annotated with 20 classes of 3D voxelized objects.
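Voxel-level semantic segmentation on benchmarks like this is typically scored with per-class intersection-over-union (IoU) over the annotated classes. A minimal sketch (the class count and ignore-label convention are assumptions for illustration, not ScanNet's exact evaluation code):

```python
import numpy as np

def per_class_iou(pred, gt, num_classes, ignore_label=-1):
    """Per-class IoU between predicted and ground-truth label arrays.
    Voxels carrying `ignore_label` (unannotated) are excluded."""
    valid = gt != ignore_label
    pred, gt = pred[valid], gt[valid]
    ious = {}
    for c in range(num_classes):
        inter = np.sum((pred == c) & (gt == c))
        union = np.sum((pred == c) | (gt == c))
        if union > 0:  # skip classes absent from both pred and gt
            ious[c] = inter / union
    return ious

# Toy example: 6 "voxels", 3 classes.
gt = np.array([0, 0, 1, 1, 2, 2])
pr = np.array([0, 1, 1, 1, 2, 0])
ious = per_class_iou(pr, gt, num_classes=3)
# ious[0] = 1/3, ious[1] = 2/3, ious[2] = 1/2
```

The mean over the per-class values (mIoU) is the usual headline number.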
649 PAPERS • 14 BENCHMARKS
The NYU-Depth V2 dataset comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect.
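A common first step with Kinect-style RGB-D data is back-projecting the metric depth map into a 3D point cloud with the pinhole camera model. A minimal sketch; the intrinsics below are nominal Kinect values assumed for illustration (the dataset's own calibration should be preferred):

```python
import numpy as np

# Nominal Kinect intrinsics (assumed; use the dataset's calibrated values).
FX, FY, CX, CY = 525.0, 525.0, 319.5, 239.5

def depth_to_points(depth):
    """Back-project an HxW metric depth map to an (H*W, 3) point cloud:
    X = (u - cx) * Z / fx,  Y = (v - cy) * Z / fy,  Z = depth."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth
    x = (u - CX) * z / FX
    y = (v - CY) * z / FY
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Synthetic example: a flat wall 2 m away at Kinect resolution.
pts = depth_to_points(np.full((480, 640), 2.0))
```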
587 PAPERS • 17 BENCHMARKS
The PASCAL Context dataset is an extension of the PASCAL VOC 2010 detection challenge that contains pixel-wise labels for all training images. It covers more than 400 classes (the original 20 classes plus backgrounds from PASCAL VOC segmentation), divided into three categories (objects, stuff, and hybrids). Many of the object categories are too sparse; therefore, a subset of 59 frequent classes is usually selected for use.
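Working with the 59-class subset means remapping the full label space to subset indices, which is conveniently done with a lookup table. A sketch under stated assumptions: the specific ids in `SUBSET_IDS` and the full-space size are hypothetical placeholders, not the dataset's real label list:

```python
import numpy as np

# Hypothetical ids of frequent classes in the full label space
# (illustrative only; the real ids come from the dataset's label list).
SUBSET_IDS = [2, 9, 18, 25, 284, 347]

def build_lut(subset_ids, num_full=459):
    """LUT mapping each full-space id to 1..len(subset_ids);
    everything else collapses to 0 ("background/other")."""
    lut = np.zeros(num_full + 1, dtype=np.int64)
    for i, full_id in enumerate(subset_ids, start=1):
        lut[full_id] = i
    return lut

lut = build_lut(SUBSET_IDS)
full_labels = np.array([[2, 9], [100, 347]])
subset_labels = lut[full_labels]  # array([[1, 2], [0, 6]])
```

Indexing the LUT with the whole label image remaps every pixel in one vectorized step.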
182 PAPERS • 6 BENCHMARKS
Taskonomy provides a large and high-quality dataset of varied indoor scenes.
92 PAPERS • 2 BENCHMARKS
The General Robust Image Task (GRIT) Benchmark is an evaluation-only benchmark for measuring the performance and robustness of vision systems across multiple image prediction tasks, concepts, and data sources. It is intended to encourage research toward vision systems that remain robust across tasks, concepts, and distributions.
8 PAPERS • 7 BENCHMARKS
Pano3D is a benchmark for depth estimation from spherical panoramas, intended to drive progress on this task in a consistent and holistic manner. It provides a standard Matterport3D train/test split, as well as a secondary GibsonV2 partitioning for training and testing. The latter is used to assess zero-shot cross-dataset transfer and is decomposed into three splits, each focusing on a specific generalization axis.
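Depth from spherical panoramas differs from the perspective case in how pixels map to viewing rays: each equirectangular pixel corresponds to a (longitude, latitude) direction on the unit sphere. A minimal sketch of that mapping, with an assumed axis convention (y up, z forward):

```python
import numpy as np

def equirect_rays(h, w):
    """Unit viewing ray for each pixel of an h x w equirectangular
    panorama. Longitude spans [-pi, pi), latitude [pi/2, -pi/2]
    (assumed convention: y up, z forward)."""
    u = (np.arange(w) + 0.5) / w      # column -> [0, 1)
    v = (np.arange(h) + 0.5) / h      # row    -> [0, 1)
    lon = (u - 0.5) * 2.0 * np.pi     # longitude per column
    lat = (0.5 - v) * np.pi           # latitude per row
    lon, lat = np.meshgrid(lon, lat)
    x = np.cos(lat) * np.sin(lon)
    y = np.sin(lat)
    z = np.cos(lat) * np.cos(lon)
    return np.stack([x, y, z], axis=-1)  # (h, w, 3), unit-length rays

rays = equirect_rays(256, 512)
```

Multiplying these rays by a predicted per-pixel depth map yields a spherical point cloud.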
1 PAPER • NO BENCHMARKS YET
SuperCaustics is a simulation tool made in Unreal Engine for generating massive computer vision datasets that include transparent objects.
1 PAPER • 1 BENCHMARK
The dataset contains procedurally generated images of transparent vessels containing liquids and objects. Each image comes with segmentation maps, 3D depth maps, and normal maps for the transparent vessel, the liquid or object inside it, and the environment. The properties of the vessel and the materials inside it are given (color/transparency/roughness/metalness), and 3D models of the objects are supplied in glTF format. Camera parameters and position are also included.
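When a dataset supplies both depth and normal maps, the two can be cross-checked: treating depth as a height field, approximate normals follow from finite differences. A rough sketch (this gradient-based approximation is an assumption for illustration; the dataset's normal maps are exact renders):

```python
import numpy as np

def normals_from_depth(depth):
    """Approximate surface normals from a depth map by treating depth
    as a height field z(u, v) and taking finite differences.
    The normal of the surface z - z(u, v) = 0 is (-dz/du, -dz/dv, 1),
    normalized per pixel."""
    dz_dv, dz_du = np.gradient(depth)  # derivatives along rows, columns
    n = np.stack([-dz_du, -dz_dv, np.ones_like(depth)], axis=-1)
    n /= np.linalg.norm(n, axis=-1, keepdims=True)
    return n

# Synthetic check: a planar ramp whose depth grows along columns
# should have one constant normal everywhere.
ramp = np.tile(np.arange(8, dtype=float), (8, 1))
normals = normals_from_depth(ramp)
```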
1 PAPER • 1 BENCHMARK