KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) is one of the most popular datasets in mobile robotics and autonomous driving. It consists of hours of traffic scenarios recorded with a variety of sensor modalities, including high-resolution RGB and grayscale stereo cameras and a 3D laser scanner. Despite its popularity, the dataset itself does not contain ground truth for semantic segmentation. However, various researchers have manually annotated parts of the dataset to fit their needs. Álvarez et al. generated ground truth for 323 images from the road detection challenge with three classes: road, vertical, and sky. Zhang et al. annotated 252 acquisitions (140 for training and 112 for testing), each consisting of an RGB image and a Velodyne scan, from the tracking challenge for ten object categories: building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence. Ros et al. labeled 170 training images and 46 testing images from the visual odometry challenge.
2,428 PAPERS • 120 BENCHMARKS
The NYU-Depth V2 dataset comprises video sequences from a variety of indoor scenes, recorded by both the RGB and depth cameras of the Microsoft Kinect. It features 1,449 densely labeled pairs of aligned RGB and depth images, 464 scenes taken from three cities, and 407,024 unlabeled frames; each object is labeled with a class and an instance number.
601 PAPERS • 17 BENCHMARKS
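The 1,449 labeled pairs ship as a single MATLAB v7.3 file (nyu_depth_v2_labeled.mat), which is an HDF5 container and can be read directly with h5py. A minimal sketch, assuming the official labeled release and its usual key names (images, depths, labels); note that h5py exposes MATLAB arrays with reversed axes:

```python
import h5py
import numpy as np

# nyu_depth_v2_labeled.mat is a MATLAB v7.3 (HDF5) file
with h5py.File("nyu_depth_v2_labeled.mat", "r") as f:
    # MATLAB stores HxWx3xN; h5py reads it back as (N, 3, W, H)
    rgb = np.transpose(f["images"][0], (2, 1, 0))  # (480, 640, 3), uint8
    depth = np.transpose(f["depths"][0], (1, 0))   # (480, 640), meters
    label = np.transpose(f["labels"][0], (1, 0))   # (480, 640), class ids

print(rgb.shape, depth.dtype, int(label.max()))
```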
The Matterport3D dataset is a large RGB-D dataset for scene understanding in indoor environments. It contains 10,800 panoramic views inside 90 real building-scale scenes, constructed from 194,400 RGB-D images. Each scene is a residential building consisting of multiple rooms and floor levels, and is annotated with surface reconstructions, camera poses, and semantic segmentations.
268 PAPERS • 4 BENCHMARKS
The dataset was collected using the Intel RealSense D435i camera, which was configured to produce synchronized accelerometer and gyroscope measurements at 400 Hz, along with synchronized VGA-size (640 x 480) RGB and depth streams at 30 Hz. The depth frames are acquired using active stereo and are aligned to the RGB frames using the sensor's factory calibration. All measurements are timestamped.
11 PAPERS • 1 BENCHMARK
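As a rough illustration of how such a stream configuration looks with the librealsense Python bindings (pyrealsense2), here is a hedged sketch; it is not the dataset's actual capture code, and note that 250 Hz is the D435i accelerometer's nearest native rate:

```python
import pyrealsense2 as rs

pipeline = rs.pipeline()
config = rs.config()
# VGA RGB and depth streams at 30 Hz, as described above
config.enable_stream(rs.stream.depth, 640, 480, rs.format.z16, 30)
config.enable_stream(rs.stream.color, 640, 480, rs.format.bgr8, 30)
# IMU streams: gyro at 400 Hz, accel at its nearest native rate (250 Hz)
config.enable_stream(rs.stream.gyro, rs.format.motion_xyz32f, 400)
config.enable_stream(rs.stream.accel, rs.format.motion_xyz32f, 250)

pipeline.start(config)
# rs.align reprojects depth onto the color frame via the factory calibration
align = rs.align(rs.stream.color)
try:
    for _ in range(60):  # video framesets arrive interleaved with motion frames
        frames = pipeline.wait_for_frames()
        depth = frames.get_depth_frame()
        color = frames.get_color_frame()
        if depth and color:
            aligned = align.process(frames)
            print("timestamps (ms):", depth.get_timestamp(), color.get_timestamp())
            break
finally:
    pipeline.stop()
```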
The KITTI-Depth dataset includes depth maps from projected LiDAR point clouds that were matched against depth estimates from the stereo cameras. The depth maps are highly sparse: only about 5% of the pixels carry valid values, and the rest are missing. The dataset has 86k training images, 7k validation images, and 1k test images evaluated on a benchmark server, with no access to the test ground truth.
10 PAPERS • NO BENCHMARKS YET
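The depth maps follow the KITTI depth devkit convention: 16-bit PNGs in which a pixel value v encodes a depth of v/256 meters and 0 marks a missing measurement. A short decoding sketch (the file path is a placeholder):

```python
import numpy as np
from PIL import Image

# 16-bit PNG; value / 256.0 gives meters, 0 means "no measurement"
raw = np.asarray(Image.open("0000000005.png"), dtype=np.uint16)  # placeholder path
depth_m = raw.astype(np.float32) / 256.0
valid = raw > 0

print(f"valid pixels: {valid.mean():.1%}")  # around 5% for the raw LiDAR maps
```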
TransCG is the first large-scale real-world dataset for transparent object depth completion and grasping. It contains 57,715 RGB-D images of 51 transparent objects, along with many opaque objects, captured from different perspectives (~240 viewpoints) across 130 scenes under real-world settings. The samples were captured by two different types of cameras (RealSense D435 and L515).
2 PAPERS • 1 BENCHMARK
The Bosch Industrial Depth Completion Dataset (BIDCD) is an RGB-D dataset of static table-top scenes with industrial objects. The data was collected with a RealSense depth camera mounted on a robotic arm, i.e., from multiple points of view (POVs), approximately 60 per scene. Depth ground truth was generated with a customized pipeline for removing erroneous depth values; multi-view geometry was then applied to fuse the cleaned depth frames and fill in missing information. The fused scene mesh was back-projected to each POV, and finally a bilateral filter was applied to reduce the remaining holes.
1 PAPER • NO BENCHMARKS YET
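The final smoothing step of that pipeline can be sketched with OpenCV's bilateral filter; the parameter values below are illustrative, not the ones used for BIDCD, and the input file is hypothetical:

```python
import cv2
import numpy as np

# Hypothetical fused depth map (float32, meters) from the multi-view stage
depth = np.load("fused_depth.npy").astype(np.float32)

# Edge-preserving smoothing: the spatial term averages over a 9-pixel
# neighborhood, while the range term (here 5 cm, since depth is in meters)
# keeps depth discontinuities at object boundaries sharp.
smoothed = cv2.bilateralFilter(depth, d=9, sigmaColor=0.05, sigmaSpace=5.0)
```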
PLAD is a dataset in which sparse depth is provided by line-based visual SLAM; it was created to validate StructMDC.
1 PAPER • 1 BENCHMARK
SuperCaustics is a simulation tool built on Unreal Engine for generating large-scale computer vision datasets that include transparent objects.