no code implementations • 22 May 2024 • Hongkai Chen, Zixin Luo, Yurun Tian, Xuyang Bai, Ziyu Wang, Lei Zhou, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
Identifying robust and accurate correspondences across images is a fundamental problem in computer vision that enables various downstream tasks.
no code implementations • CVPR 2024 • Yuanxun Lu, Jingyang Zhang, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Xun Cao, Yao Yao
The multi-view 2.5D diffusion directly models the structural distribution of 3D data, while still maintaining the strong generalization ability of the original 2D diffusion model, filling the gap between 2D diffusion-based and direct 3D diffusion-based methods for 3D content generation.
no code implementations • 10 Oct 2023 • Jingyang Zhang, Shiwei Li, Yuanxun Lu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan, Yao Yao
We introduce JointNet, a novel neural network architecture for modeling the joint distribution of images and an additional dense modality (e.g., depth maps).
no code implementations • ICCV 2023 • Jingyang Zhang, Yao Yao, Shiwei Li, Jingbo Liu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
We present a novel differentiable rendering framework for joint geometry, material, and lighting estimation from multi-view images.
1 code implementation • 30 Aug 2022 • Hongkai Chen, Zixin Luo, Lei Zhou, Yurun Tian, Mingmin Zhen, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
Generating robust and reliable correspondences across images is a fundamental task for a diversity of applications.
no code implementations • CVPR 2022 • Jingyang Zhang, Yao Yao, Shiwei Li, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
The first one is the Hessian regularization that smoothly diffuses the signed distance values to the entire distance field given noisy and incomplete input.
1 code implementation • 14 Mar 2022 • Yao Yao, Jingyang Zhang, Jingbo Liu, Yihang Qu, Tian Fang, David McKinnon, Yanghai Tsin, Long Quan
We present a differentiable rendering framework for material and lighting estimation from multi-view images and a reconstructed geometry.
1 code implementation • 18 Aug 2020 • Jingyang Zhang, Yao Yao, Shiwei Li, Zixin Luo, Tian Fang
As such, the adverse influence of occluded pixels is suppressed in the cost fusion.
Ranked #1 on Point Clouds on DTU
1 code implementation • 11 Aug 2020 • Jingyang Zhang, Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan
Finally, a matchability-aware disparity refinement is introduced to improve the depth inference in weakly matchable regions.
Ranked #2 on Stereo Disparity Estimation on KITTI 2015
no code implementations • ECCV 2020 • Mingmin Zhen, Shiwei Li, Lei Zhou, Jiaxiang Shang, Haoan Feng, Tian Fang, Long Quan
In this paper, we introduce a novel network, called discriminative feature network (DFNet), to address the unsupervised video object segmentation task.
Ranked #1 on Video Object Segmentation on FBMS
1 code implementation • ECCV 2020 • Lei Zhou, Zixin Luo, Mingmin Zhen, Tianwei Shen, Shiwei Li, Zhuofei Huang, Tian Fang, Long Quan
In this work, we propose a stochastic bundle adjustment algorithm which seeks to decompose the RCS approximately inside the LM iterations to improve the efficiency and scalability.
1 code implementation • ECCV 2020 • Jiaxiang Shang, Tianwei Shen, Shiwei Li, Lei Zhou, Mingmin Zhen, Tian Fang, Long Quan
Recent learning-based approaches, in which models are trained on single-view images, have shown promising results for monocular 3D face reconstruction, but they suffer from the ill-posed face pose and depth ambiguity issue.
Ranked #7 on 3D Face Reconstruction on REALY (side-view)
no code implementations • CVPR 2020 • Mingmin Zhen, Jinglu Wang, Lei Zhou, Shiwei Li, Tianwei Shen, Jiaxiang Shang, Tian Fang, Quan Long
In this paper, we present a joint multi-task learning framework for semantic segmentation and boundary detection.
1 code implementation • CVPR 2020 • Lei Zhou, Zixin Luo, Tianwei Shen, Jiahui Zhang, Mingmin Zhen, Yao Yao, Tian Fang, Long Quan
Temporal camera relocalization estimates the pose with respect to each video frame in sequence, as opposed to one-shot relocalization which focuses on a still image.
4 code implementations • CVPR 2020 • Zixin Luo, Lei Zhou, Xuyang Bai, Hongkai Chen, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan
This work focuses on mitigating two limitations in the joint learning of local feature detectors and descriptors.
2 code implementations • CVPR 2020 • Yao Yao, Zixin Luo, Shiwei Li, Jingyang Zhang, Yufan Ren, Lei Zhou, Tian Fang, Long Quan
Compared with other computer vision tasks, it is rather difficult to collect a large-scale MVS dataset, as it requires expensive active scanners and a labor-intensive process to obtain ground truth 3D structures.
1 code implementation • 19 Sep 2019 • Tianwei Shen, Lei Zhou, Zixin Luo, Yao Yao, Shiwei Li, Jiahui Zhang, Tian Fang, Long Quan
The self-supervised learning of depth and pose from monocular sequences provides an attractive solution by using the photometric consistency of nearby frames as it depends much less on the ground-truth data.
no code implementations • 22 May 2019 • Mingmin Zhen, Jinglu Wang, Lei Zhou, Tian Fang, Long Quan
On the other hand, it learns more efficiently with the more efficient gradient backpropagation.
Ranked #78 on Semantic Segmentation on NYU Depth v2
1 code implementation • CVPR 2019 • Zixin Luo, Tianwei Shen, Lei Zhou, Jiahui Zhang, Yao Yao, Shiwei Li, Tian Fang, Long Quan
Most existing studies on learning local features focus on the patch-based descriptions of individual keypoints, while neglecting the spatial relations established from their keypoint locations.
1 code implementation • CVPR 2019 • Yao Yao, Zixin Luo, Shiwei Li, Tianwei Shen, Tian Fang, Long Quan
However, one major limitation of current learned MVS approaches is the scalability: the memory-consuming cost volume regularization makes the learned MVS hard to apply to high-resolution scenes.
1 code implementation • 25 Feb 2019 • Tianwei Shen, Zixin Luo, Lei Zhou, Hanyu Deng, Runze Zhang, Tian Fang, Long Quan
Accurate relative pose is one of the key components in visual odometry (VO) and simultaneous localization and mapping (SLAM).
Ranked #3 on Camera Pose Estimation on KITTI Odometry Benchmark
1 code implementation • 26 Nov 2018 • Tianwei Shen, Zixin Luo, Lei Zhou, Runze Zhang, Siyu Zhu, Tian Fang, Long Quan
Convolutional Neural Networks (CNNs) have achieved superior performance on object image retrieval, while Bag-of-Words (BoW) models with handcrafted local features still dominate the retrieval of overlapping images in 3D reconstruction.
1 code implementation • ECCV 2018 • Zixin Luo, Tianwei Shen, Lei Zhou, Siyu Zhu, Runze Zhang, Yao Yao, Tian Fang, Long Quan
Learned local descriptors based on Convolutional Neural Networks (CNNs) have achieved significant improvements on patch-based benchmarks, but have not demonstrated strong generalization ability on recent benchmarks of image-based 3D reconstruction.
no code implementations • ECCV 2018 • Lei Zhou, Siyu Zhu, Zixin Luo, Tianwei Shen, Runze Zhang, Mingmin Zhen, Tian Fang, Long Quan
Critical to the registration of point clouds is the establishment of a set of accurate correspondences between points in 3D space.
no code implementations • CVPR 2018 • Shiwei Li, Yao Yao, Tian Fang, Long Quan
We present a novel surface reconstruction method using both curves and point clouds.
no code implementations • CVPR 2018 • Siyu Zhu, Runze Zhang, Lei Zhou, Tianwei Shen, Tian Fang, Ping Tan, Long Quan
This work proposes a divide-and-conquer framework to solve very large global SfM at the scale of millions of images.
5 code implementations • ECCV 2018 • Yao Yao, Zixin Luo, Shiwei Li, Tian Fang, Long Quan
We present an end-to-end deep learning architecture for depth map inference from multi-view images.
Ranked #19 on Point Clouds on Tanks and Temples (Mean F1 (Intermediate) metric)
no code implementations • ICCV 2017 • Lei Zhou, Siyu Zhu, Tianwei Shen, Jinglu Wang, Tian Fang, Long Quan
In this paper, we propose a scale-invariant image matching approach to tackling the very large scale variation of views.
no code implementations • ICCV 2017 • Runze Zhang, Siyu Zhu, Tian Fang, Long Quan
In this paper, we propose a distributed approach to coping with this global bundle adjustment for very large scale Structure-from-Motion computation.
no code implementations • 28 Feb 2017 • Siyu Zhu, Tianwei Shen, Lei Zhou, Runze Zhang, Jinglu Wang, Tian Fang, Long Quan
In this paper, we tackle the accurate and consistent Structure from Motion (SfM) problem, in particular camera registration, far exceeding the memory of a single computer in parallel.
no code implementations • ICCV 2015 • Runze Zhang, Shiwei Li, Tian Fang, Siyu Zhu, Long Quan
To solve this problem, we propose a joint optimization in a hierarchical framework to obtain the final surface segments and corresponding optimal camera clusters.
no code implementations • ICCV 2015 • Jingbo Liu, Jinglu Wang, Tian Fang, Chiew-Lan Tai, Long Quan
In this paper, we propose a structural segmentation algorithm to partition multi-view stereo reconstructed surfaces of large-scale urban environments into structural segments.
no code implementations • CVPR 2014 • Siyu Zhu, Tian Fang, Jianxiong Xiao, Long Quan
To this end, we propose a segment-based approach to readjust the camera poses locally and improve the reconstruction for fine geometry details.