Generating robust and reliable correspondences across images is a fundamental task for a wide range of applications.
2) Seeded Graph Neural Network, which utilizes seed matches to pass messages within/across images and predicts assignment costs.
Removing outlier correspondences is one of the critical steps for successful feature-based point cloud registration.
As such, the adverse influence of occluded pixels is suppressed in the cost fusion.
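As a rough illustration of how occluded pixels can be down-weighted during cost fusion, the sketch below fuses per-view matching costs for a single pixel with a visibility-weighted average. The function name `fuse_costs`, the weights, and the fallback behaviour are assumptions made for this example and are not taken from the paper.

```python
def fuse_costs(costs, visibility, eps=1e-8):
    """Visibility-weighted average of per-view matching costs for one pixel.

    costs:      one matching cost per source view
    visibility: one non-negative weight per source view (0 = fully occluded)
    """
    if len(costs) != len(visibility):
        raise ValueError("costs and visibility must have the same length")
    total_weight = sum(visibility)
    if total_weight < eps:
        # Hypothetical fallback: no view is trusted, so use the plain mean.
        return sum(costs) / len(costs)
    return sum(w * c for w, c in zip(visibility, costs)) / total_weight


# The second view is treated as occluded, so its high cost is ignored.
fused = fuse_costs([0.2, 0.9, 0.3], [1.0, 0.0, 1.0])
```

With weight 0 on the occluded view, only the two visible views contribute to the fused cost.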
Ranked #1 on Point Clouds on DTU
Finally, a matchability-aware disparity refinement is introduced to improve the depth inference in weakly matchable regions.
Ranked #1 on Stereo Disparity Estimation on KITTI 2015
In this work, we propose a stochastic bundle adjustment algorithm which seeks to decompose the reduced camera system (RCS) approximately inside the Levenberg-Marquardt (LM) iterations to improve efficiency and scalability.
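For context, the reduced camera system (RCS) in bundle adjustment is usually obtained by eliminating the point parameters from the damped normal equations with a Schur complement. The block below is the standard textbook form, not the paper's stochastic decomposition; the symbols $B$, $E$, $C$, $v$, $w$ are notation assumed here for the camera block, the camera-point coupling block, the point block, and the two right-hand sides.

```latex
% Damped normal equations of one LM step, split into camera (c) and point (p) blocks
\begin{bmatrix} B & E \\ E^{\top} & C \end{bmatrix}
\begin{bmatrix} \Delta x_c \\ \Delta x_p \end{bmatrix}
=
\begin{bmatrix} v \\ w \end{bmatrix}

% Eliminating \Delta x_p gives the reduced camera system
\left( B - E\,C^{-1}E^{\top} \right) \Delta x_c = v - E\,C^{-1} w

% Back-substitution recovers the point update
\Delta x_p = C^{-1}\left( w - E^{\top} \Delta x_c \right)
```

Because $C$ is block-diagonal (one small block per 3D point), $C^{-1}$ is cheap to form; the cost of each iteration is dominated by solving the system in $\Delta x_c$.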
Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image.
This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors.
In this paper, we leverage a 3D fully convolutional network for 3D point clouds, and propose a novel and practical learning mechanism that densely predicts both a detection score and a description feature for each 3D point.
Ranked #2 on Point Cloud Registration on KITTI
Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset, as it requires expensive active scanners and a labor-intensive process to obtain ground-truth 3D structures.
Self-supervised learning of depth and pose from monocular sequences is an attractive solution: it uses the photometric consistency of nearby frames as supervision and therefore depends much less on ground-truth data.
First, to capture the local context of sparse correspondences, the network clusters unordered input correspondences by learning a soft assignment matrix.
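A minimal sketch of soft-assignment clustering for unordered correspondences, assuming each correspondence has a feature vector and a row of cluster logits. In a learned model the logits would be produced by the network; here they are set by hand, and the names `softmax` and `soft_cluster` are hypothetical.

```python
import math


def softmax(row):
    """Numerically stable softmax over one list of logits."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def soft_cluster(features, logits):
    """Pool N correspondence features into K cluster features.

    features: N x C list of lists (one feature vector per correspondence)
    logits:   N x K list of lists (unnormalized cluster scores)
    Returns (assign, pooled): the N x K soft assignment matrix and the
    K x C assignment-weighted mean feature of each cluster.
    """
    assign = [softmax(row) for row in logits]  # each row sums to 1
    n_clusters = len(logits[0])
    dim = len(features[0])
    pooled = []
    for k in range(n_clusters):
        mass = sum(a[k] for a in assign)  # always > 0 because softmax is positive
        pooled.append([
            sum(a[k] * f[d] for a, f in zip(assign, features)) / mass
            for d in range(dim)
        ])
    return assign, pooled


# Three correspondences, two clusters: the first two favour cluster 0.
features = [[0.0, 0.0], [0.0, 2.0], [10.0, 10.0]]
logits = [[50.0, 0.0], [50.0, 0.0], [0.0, 50.0]]
assign, pooled = soft_cluster(features, logits)
```

Because the assignment is a softmax rather than a hard argmax, every step is differentiable, which is what allows such a matrix to be learned end to end.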
Most existing studies on learning local features focus on patch-based descriptions of individual keypoints, while neglecting the spatial relations established by their keypoint locations.
However, one major limitation of current learned MVS approaches is scalability: the memory-consuming cost volume regularization makes learned MVS hard to apply to high-resolution scenes.
Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM).
Convolutional Neural Networks (CNNs) have achieved superior performance on object image retrieval, while Bag-of-Words (BoW) models with handcrafted local features still dominate the retrieval of overlapping images in 3D reconstruction.
Learned local descriptors based on Convolutional Neural Networks (CNNs) have achieved significant improvements on patch-based benchmarks, but have not demonstrated strong generalization ability on recent benchmarks of image-based 3D reconstruction.
Critical to the registration of point clouds is the establishment of a set of accurate correspondences between points in 3D space.
We present an end-to-end deep learning architecture for depth map inference from multi-view images.
Ranked #14 on Point Clouds on Tanks and Temples (Mean F1 (Intermediate) metric)