Self-supervised learning has emerged as a powerful tool for depth and ego-motion estimation, leading to state-of-the-art results on benchmark datasets.
In autonomous driving, accurately estimating the state of surrounding obstacles is critical for safe and robust path planning.
By making the sampling of inlier-outlier sets from point-pair correspondences fully differentiable within the keypoint learning framework, we show that we are able to simultaneously self-supervise keypoint description and improve keypoint matching.
Detecting and matching robust viewpoint-invariant keypoints is critical for visual SLAM and Structure-from-Motion.
Dense depth estimation from a single image is a key problem in computer vision, with exciting applications in a multitude of robotic tasks.
Learning depth and camera ego-motion from raw unlabeled RGB video streams is seeing exciting progress through self-supervision from strong geometric cues.
Place recognition is a critical component of robot navigation: it enables a robot to recognize previously visited locations and to use this information to correct the drift incurred in its dead-reckoned estimate.
Although cameras are ubiquitous, robotic platforms typically rely on active sensors like LiDAR for direct 3D perception.
Both contributions provide significant performance gains over the state-of-the-art in self-supervised depth and pose estimation on the public KITTI benchmark.
Many model-based Visual Odometry (VO) algorithms have been proposed in the past decade, often restricted to a specific type of camera optics or to the underlying motion manifold observed.
Traditional stereo algorithms have focused their efforts on reconstruction quality and have largely avoided prioritizing run-time performance.
In this work, we develop a monocular SLAM-aware object recognition system that achieves considerably stronger recognition performance than classical object recognition systems that operate on a frame-by-frame basis.
We propose a simple and useful idea based on cross-ratio constraint for wide-baseline matching and 3D reconstruction.
This paper describes a method by which a robot can acquire an object model by capturing depth imagery of the object as a human moves it through its range of motion.