LINEMOD is an RGB-D dataset that has become a de facto standard benchmark for 6D pose estimation. It contains 15 object sequences of poorly textured objects in cluttered scenes. The images in each sequence contain multiple objects; however, only one object per sequence is annotated with a ground-truth class label, bounding box, and 6D pose. The camera intrinsic matrix is also provided with the dataset.
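Given the intrinsic matrix and a ground-truth 6D pose, model points can be projected into the image to visualize or verify an annotation. A minimal NumPy sketch of this projection is below; the intrinsic values and pose are illustrative placeholders, not taken from the dataset files.

```python
import numpy as np

# Illustrative intrinsics (fx, fy, cx, cy); the real values come
# from the camera matrix shipped with the dataset.
K = np.array([[572.4,   0.0, 325.3],
              [  0.0, 573.6, 242.0],
              [  0.0,   0.0,   1.0]])

# A 6D pose is a rotation R (3x3) and translation t (meters) that
# map model coordinates into the camera frame.
R = np.eye(3)
t = np.array([0.0, 0.0, 1.0])  # object 1 m in front of the camera

def project(points, K, R, t):
    """Project Nx3 model points to Nx2 pixel coordinates."""
    cam = points @ R.T + t          # model frame -> camera frame
    uv = cam @ K.T                  # apply intrinsics
    return uv[:, :2] / uv[:, 2:3]   # perspective divide

pts = np.array([[0.0, 0.0, 0.0]])   # the model origin
print(project(pts, K, R, t))        # -> [[325.3 242. ]], the principal point
```

With an identity rotation and the object centered on the optical axis, the model origin projects exactly to the principal point (cx, cy), which is a quick sanity check for pose annotations.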
141 PAPERS • 5 BENCHMARKS
The YCB-Video dataset is a large-scale video dataset for 6D object pose estimation. It provides accurate 6D poses of 21 objects from the YCB dataset observed in 92 videos totaling 133,827 frames.
73 PAPERS • 5 BENCHMARKS
ApolloCar3D is a dataset that contains 5,277 driving images and over 60K car instances, where each car is fitted with an industry-grade 3D CAD model with absolute model size and semantically labelled keypoints. The dataset is more than 20 times larger than PASCAL3D+ and KITTI, the previous state of the art.
12 PAPERS • 14 BENCHMARKS
HomebrewedDB is a dataset for 6D pose estimation mainly targeting training from 3D models (both textured and textureless), scalability, occlusions, and changes in lighting conditions and object appearance. The dataset features 33 objects (17 toys, 8 household and 8 industry-relevant objects) over 13 scenes of varying difficulty. It also includes a set of benchmarks that test desired detector properties, particularly scalability with respect to the number of objects and robustness to changing lighting conditions, occlusions, and clutter.
12 PAPERS • NO BENCHMARKS YET
A dataset with significant occlusions arising from object manipulation.
3 PAPERS • NO BENCHMARKS YET
The dataset consists of both real captures from Photoneo PhoXi structured-light scanners, annotated by hand, and synthetic samples produced by a custom generator. In comparison with existing datasets for 6D pose estimation, it exhibits several notable differences.
1 PAPER • 1 BENCHMARK
The SMOT dataset (Single sequence-Multiple Objects Training) was collected to represent a practical scenario of gathering training images of new objects in the real world: a mobile robot with an RGB-D camera collects a sequence of frames while driving around a table to learn multiple objects, then tries to recognize those objects in different locations.
1 PAPER • NO BENCHMARKS YET