7 dataset results for Monocular Depth Estimation AND Stereo

Middlebury 2014

The Middlebury 2014 dataset contains 23 high-resolution stereo pairs with known camera calibration parameters and ground truth disparity maps obtained with a structured light scanner. All images show static indoor scenes of varying difficulty, including repetitive structures, occlusions, wiry objects, and untextured areas.

51 PAPERS • 2 BENCHMARKS

3D Ken Burns

This dataset accompanies our paper on synthesizing the 3D Ken Burns effect from a single image. It consists of 134041 captures from 32 virtual environments, where each capture consists of 4 views. Each view contains color, depth, and normal maps at a resolution of 512x512 pixels.

13 PAPERS • NO BENCHMARKS YET

WSVD (Web Stereo Video Dataset)

The Web Stereo Video Dataset consists of 553 stereoscopic videos from YouTube. This dataset has a wide variety of scene types, and features many nonrigid objects.

12 PAPERS • NO BENCHMARKS YET

Holopix50k

An in-the-wild stereo image dataset, comprising 49,368 image pairs contributed by users of the Holopix mobile social platform.

9 PAPERS • NO BENCHMARKS YET

UASOL (A large-scale high-resolution outdoor stereo dataset)

UASOL is an RGB-D stereo dataset that contains 160902 frames filmed at 33 different scenes, each with between 2 k and 10 k frames. The frames show different paths from the perspective of a pedestrian, including sidewalks, trails, and roads. The images were extracted from 15 fps video files at HD2K resolution, with a size of 2280 × 1282 pixels. The dataset also provides a GPS geolocalization tag for each second of the sequences and reflects different climatological conditions. Up to 4 different people filmed the dataset at different times of the day.

3 PAPERS • 1 BENCHMARK

VA (Virtual Apartment)

A synthetic depth estimation benchmark dataset rendered from a high-quality CAD indoor environment.

3 PAPERS • 1 BENCHMARK

InfraParis

InfraParis is a novel and versatile dataset supporting multiple tasks across three modalities: RGB, depth, and infrared. From the city to the suburbs, it contains a variety of styles in different areas of the greater Paris area, providing rich semantic information. InfraParis contains 7301 images with bounding boxes and full semantic (19 classes) annotations. We assess various state-of-the-art baseline techniques, encompassing models for the tasks of semantic segmentation, object detection, and depth estimation.

1 PAPER • NO BENCHMARKS YET
