no code implementations • 10 Mar 2024 • Guangkai Xu, Yongtao Ge, MingYu Liu, Chengxiang Fan, Kangyang Xie, Zhiyue Zhao, Hao Chen, Chunhua Shen
We show that, simply initializing image understanding models using a pre-trained UNet (or transformer) of diffusion models, it is possible to achieve remarkable transferable performance on fundamental vision perception tasks using a moderate amount of target data (even synthetic data only), including monocular depth, surface normal, image segmentation, matting, human pose estimation, among virtually many others.
no code implementations • 18 Sep 2023 • Xinyi Yu, Liqin Lu, Jintao Rong, Guangkai Xu, Linlin Ou
3D scene reconstruction from 2D images has been a long-standing task.
no code implementations • ICCV 2023 • Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Zhao
3D scene reconstruction is a long-standing vision task.
no code implementations • 14 Apr 2023 • Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, Myungwoo Nam, Matteo Poggi, Xiaohua Qi, Jiahui Ren, Yang Tang, Fabio Tosi, Linh Trinh, S. M. Nadim Uddin, Khan Muhammad Umair, Kaixuan Wang, YuFei Wang, Yixing Wang, Mochu Xiang, Guangkai Xu, Wei Yin, Jun Yu, Qi Zhang, Chaoqiang Zhao
This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC).
1 code implementation • 29 Jul 2022 • Guangkai Xu, Wei Yin, Jianming Zhang, Oliver Wang, Simon Niklaus, Simon Chen, Jia-Wang Bian
Our method leverages a data-driven prior in the form of a single image depth prediction network trained on large-scale datasets, the output of which is used as an input to our model.
no code implementations • 5 May 2022 • Kai Cheng, Hao Chen, Wei Yin, Guangkai Xu, Xuejin Chen
However, multi-view depth estimation is fundamentally a correspondence-based optimization problem, but previous learning-based methods mainly rely on predefined depth hypotheses to build correspondence as the cost volume and implicitly regularize it to fit depth prediction, deviating from the essence of iterative optimization based on stereo correspondence.
no code implementations • 3 Feb 2022 • Guangkai Xu, Wei Yin, Hao Chen, Chunhua Shen, Kai Cheng, Feng Wu, Feng Zhao
However, in some video-based scenarios such as video depth estimation and 3D scene reconstruction from a video, the unknown scale and shift residing in per-frame prediction may cause the depth inconsistency.