🔔 Share your dataset with the ML community!

Filter by Modality

Filter by Task (clear)

Filter by Language

21 dataset results for Visual Odometry

KITTI (Karlsruhe Institute of Technology and Toyota Technological Institute) is one of the most popular datasets for use in mobile robotics and autonomous driving. It consists of hours of traffic scenarios recorded with a variety of sensor modalities, including high-resolution RGB, grayscale stereo cameras, and a 3D laser scanner. Despite its popularity, the dataset itself does not contain ground truth for semantic segmentation. However, various researchers have manually annotated parts of the dataset to fit their necessities. Álvarez et al. generated ground truth for 323 images from the road detection challenge with three classes: road, vertical, and sky. Zhang et al. annotated 252 (140 for training and 112 for testing) acquisitions – RGB and Velodyne scans – from the tracking challenge for ten object categories: building, sky, road, vegetation, sidewalk, car, pedestrian, cyclist, sign/pole, and fence. Ros et al. labeled 170 training images and 46 testing images (from the visual odome

3,219 PAPERS • 141 BENCHMARKS

TUM RGB-D

TUM RGB-D is an RGB-D dataset. It contains the color and depth images of a Microsoft Kinect sensor along the ground-truth trajectory of the sensor. The data was recorded at full frame rate (30 Hz) and sensor resolution (640x480). The ground-truth trajectory was obtained from a high-accuracy motion-capture system with eight high-speed tracking cameras (100 Hz).

189 PAPERS • NO BENCHMARKS YET

Virtual KITTI

Virtual KITTI is a photo-realistic synthetic video dataset designed to learn and evaluate computer vision models for several video understanding tasks: object detection and multi-object tracking, scene-level and instance-level semantic segmentation, optical flow, and depth estimation.

120 PAPERS • 1 BENCHMARK

TartanAir

A dataset for robot navigation task and more. The data is collected in photo-realistic simulation environments in the presence of various light conditions, weather and moving objects.

68 PAPERS • NO BENCHMARKS YET

Event-Camera Dataset

The Event-Camera Dataset is a collection of datasets with an event-based camera for high-speed robotics. The data also include intensity images, inertial measurements, and ground truth from a motion-capture system. An event-based camera is a revolutionary vision sensor with three key advantages: a measurement rate that is almost 1 million times faster than standard cameras, a latency of 1 microsecond, and a high dynamic range of 130 decibels (standard cameras only have 60 dB). These properties enable the design of a new class of algorithms for high-speed robotics, where standard cameras suffer from motion blur and high latency. All the data are released both as text files and binary (i.e., rosbag) files.

44 PAPERS • 2 BENCHMARKS

Virtual KITTI 2

Virtual KITTI 2 is an updated version of the well-known Virtual KITTI dataset which consists of 5 sequence clones from the KITTI tracking benchmark. In addition, the dataset provides different variants of these sequences such as modified weather conditions (e.g. fog, rain) or modified camera configurations (e.g. rotated by 15◦). For each sequence we provide multiple sets of images containing RGB, depth, class segmentation, instance segmentation, flow, and scene flow data. Camera parameters and poses as well as vehicle locations are available as well. In order to showcase some of the dataset’s capabilities, we ran multiple relevant experiments using state-of-the-art algorithms from the field of autonomous driving. The dataset is available for download at https://europe.naverlabs.com/Research/Computer-Vision/Proxy-Virtual-Worlds.

33 PAPERS • 1 BENCHMARK

New College

The New College Data is a freely available dataset collected from a robot completing several loops outdoors around the New College campus in Oxford. The data includes odometry, laser scan, and visual information. The dataset URL is not working anymore.

16 PAPERS • NO BENCHMARKS YET

4Seasons

4Seasons is adataset covering seasonal and challenging perceptual conditions for autonomous driving.

13 PAPERS • NO BENCHMARKS YET

TUM monoVO

TUM monoVO is a dataset for evaluating the tracking accuracy of monocular Visual Odometry (VO) and SLAM methods. It contains 50 real-world sequences comprising over 100 minutes of video, recorded across different environments – ranging from narrow indoor corridors to wide outdoor scenes. All sequences contain mostly exploring camera motion, starting and ending at the same position: this allows to evaluate tracking accuracy via the accumulated drift from start to end, without requiring ground-truth for the full sequence. In contrast to existing datasets, all sequences are photometrically calibrated: the dataset creators provide the exposure times for each frame as reported by the sensor, the camera response function and the lens attenuation factors (vignetting).

11 PAPERS • NO BENCHMARKS YET

ALTO (Aerial-view Large-scale Terrain-Oriented)

ALTO is a vision-focused dataset for the development and benchmarking of Visual Place Recognition and Localization methods for Unmanned Aerial Vehicles. The dataset is composed of two long (approximately 150km and 260km) trajectories flown by a helicopter over Ohio and Pennsylvania, and it includes high precision GPS-INS ground truth location data, high precision accelerometer readings, laser altimeter readings, and RGB downward facing camera imagery.The dataset also comes with reference imagery over the flight paths, which makes this dataset suitable for VPR benchmarking and other tasks common in Localization, such as image registration and visual odometry.

3 PAPERS • NO BENCHMARKS YET

EndoSLAM (Endoscopic SLAM dataset)

The endoscopic SLAM dataset (EndoSLAM) is a dataset for depth estimation approach for endoscopic videos. It consists of both ex-vivo and synthetically generated data. The ex-vivo part of the dataset includes standard as well as capsule endoscopy recordings. The dataset is divided into 35 sub-datasets. Specifically, 18, 5 and 12 sub-datasets exist for colon, small intestine and stomach respectively.

3 PAPERS • NO BENCHMARKS YET

TUM Visual-Inertial Dataset

A novel dataset with a diverse set of sequences in different scenes for evaluating VI odometry. It provides camera images with 1024x1024 resolution at 20 Hz, high dynamic range and photometric calibration.

3 PAPERS • NO BENCHMARKS YET

Aqualoc

A new underwater dataset that has been recorded in an harbor and provides several sequences with synchronized measurements from a monocular camera, a MEMS-IMU and a pressure sensor.

2 PAPERS • NO BENCHMARKS YET

AUT-VI (Amirkabir campus dataset)

AUT-VI is a super-challenging visual inertial dataset with 126 diverse sequences in 17 locations. This dataset contains dynamic objects, challenging loop-closure/map-reuse, different lighting conditions, reflections, and sudden camera movements to cover all extreme navigation scenarios. Moreover, the Android application for data capture is released to the public to support ongoing development efforts. This dataset aims to exploit the remaining challenges in VIO algorithms, in the hope of improving them to facilitate navigation for visually impaired individuals in both indoor and outdoor settings.

1 PAPER • NO BENCHMARKS YET

BPOD

Brown Pedestrian Odometry Dataset (BPOD) is a dataset for benchmarking visual odometry algorithms in head-mounted pedestrian settings. This dataset was captured using synchronized global and rolling shutter stereo cameras in 12 diverse indoor and outdoor locations on Brown University's campus. Compared to existing datasets, BPOD contains more image blur and self-rotation, which are common in pedestrian odometry but rare elsewhere. Ground-truth trajectories are generated from stick-on markers placed along the pedestrian’s path, and the pedestrian's position is documented using a third-person video.

1 PAPER • NO BENCHMARKS YET

ConsInv Dataset

ConsInv is a stereo RGB + IMU dataset designed for Dynamic SLAM testing and contains two subsets:

1 PAPER • NO BENCHMARKS YET

Drunkard's Dataset

Estimating camera motion in deformable scenes poses a complex and open research challenge. Most existing non-rigid structure from motion techniques assume to observe also static scene parts besides deforming scene parts in order to establish an anchoring reference. However, this assumption does not hold true in certain relevant application cases such as endoscopies. To tackle this issue with a common benchmark, we introduce the Drunkard’s Dataset, a challenging collection of synthetic data targeting visual navigation and reconstruction in deformable environments. This dataset is the first large set of exploratory camera trajectories with ground truth inside 3D scenes where every surface exhibits non-rigid deformations over time. Simulations in realistic 3D buildings lets us obtain a vast amount of data and ground truth labels, including camera poses, RGB images and depth, optical flow and normal maps at high resolution and quality.

1 PAPER • 1 BENCHMARK

KAIST VIO Dataset

This is the dataset for testing the robustness of various VO/VIO methods, acquired on reak UAV.

1 PAPER • NO BENCHMARKS YET

MineNav

MinNav is a synthetic dataset based on the sandbox game Minecraft. The dataset uses several plug-in program to generate rendered image sequences with time-aligned depth maps, surface normal maps and camera poses. Thanks for the large game's community, there is an extremely large number of 3D open-world environment, users can find suitable scenes for shooting and build data sets through it and they can also build scenes in-game.

1 PAPER • NO BENCHMARKS YET

UMA-VI Dataset

UMA-VI Dataset (The UMA-VI dataset: Visual--inertial odometry in low-textured and dynamic illumination environments)

The dataset contains 32 sequences for the evaluation of VI motion estimation methods, totalling ∼80 min of data. The dataset covers challenging conditions (mainly illumination changes and low textured environments) in different degrees and a wide rage of scenarios (including corridors, parking, classrooms, halls, etc.) from two different buildings at the University of Malaga. In general, we provide at least two different sequences within the same scenario, with different illumination conditions or following different trajectories. All sequences were recorded with our VI sensor handheld, except a few that were recorded while mounted in a car.

1 PAPER • NO BENCHMARKS YET

Multi-Spectral Stereo Dataset (RGB, NIR, thermal images, LiDAR, GPS/IMU)

Abstract: We introduce the multi-spectral stereo (MS2) outdoor dataset, including stereo RGB, stereo NIR, stereo thermal, stereo LiDAR data, and GPS/IMU information. Our dataset provides rectified and synchronized 184K data pairs taken from city, residential, road, campus, and suburban areas in the morning, daytime, and nighttime under clear-sky, cloudy, and rainy conditions. We designed the dataset to explore various computer vision algorithms from multi-spectral sensor data to achieve high-level performance, reliability, and robustness against challenging environments.

0 PAPER • NO BENCHMARKS YET

Datasets

21 dataset results for Visual Odometry