no code implementations • 3 May 2024 • Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang
This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M$^2$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.
no code implementations • 28 Feb 2024 • Jian Liu, Sipeng Zhang, Chuixin Kong, Wenyuan Zhang, Yuhang Wu, Yikang Ding, Borun Xu, Ruibo Ming, Donglai Wei, Xianming Liu
This technical report presents our solution, "occTransformer" for the 3D occupancy prediction track in the autonomous driving challenge at CVPR 2023.
no code implementations • 30 May 2023 • Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li
During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description.
no code implementations • 20 Sep 2022 • Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li
In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.
1 code implementation • 21 Jul 2022 • Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang
Supervised multi-view stereo (MVS) methods have achieved remarkable progress in terms of reconstruction quality, but suffer from the challenge of collecting large-scale ground-truth depth.
1 code implementation • 21 Jul 2022 • Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang
Recently, Implicit Neural Representations (INRs) parameterized by neural networks have emerged as a powerful and promising tool to represent different kinds of signals due to their continuous and differentiable properties, showing superiority over classical discretized representations.
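The core idea behind INRs — a learned function mapping continuous coordinates to signal values, queryable at arbitrary resolution — can be sketched in a few lines. The sketch below stands in a simple random-Fourier-feature regression for the neural network; the frequency scale and feature count are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hedged sketch: represent a discretely sampled 1D signal as a continuous
# function of the coordinate, using random Fourier features plus a linear
# head fitted by least squares. This stands in for the coordinate-to-value
# network of an INR; the paper's actual model and training differ.

rng = np.random.default_rng(0)

# "Ground-truth" signal sampled at 64 discrete points.
xs = np.linspace(0.0, 1.0, 64)
signal = np.sin(2 * np.pi * 3 * xs) + 0.5 * np.cos(2 * np.pi * 7 * xs)

# Random Fourier-feature embedding of the scalar coordinate
# (frequency scale 10.0 and 16 frequencies are assumed hyperparameters).
B = rng.normal(scale=10.0, size=16)

def features(x):
    """Embed coordinates x into [sin(xB), cos(xB)] features."""
    x = np.atleast_1d(x)
    return np.concatenate([np.sin(np.outer(x, B)), np.cos(np.outer(x, B))], axis=1)

# Fit the linear head by least squares on the discrete samples.
w, *_ = np.linalg.lstsq(features(xs), signal, rcond=None)

# The representation is continuous and differentiable: it can be queried
# at a coordinate that never appeared in the discrete samples.
query = 0.123
approx = float(features(query) @ w)
print(approx)
```

Because the representation is a function of the coordinate rather than a fixed grid, the same fitted weights answer queries at any resolution — the property the abstract contrasts with classical discretized representations.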
1 code implementation • CVPR 2023 • Dihe Huang, Ying Chen, Shang Xu, Yong Liu, Wenlong Wu, Yikang Ding, Chengjie Wang, Fan Tang
Detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.
1 code implementation • 21 Jun 2022 • Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang
Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.
no code implementations • 28 May 2022 • Jinli Liao, Yikang Ding, Yoli Shavit, Dihe Huang, Shihao Ren, Jia Guo, Wensen Feng, Kai Zhang
In this work, we propose Window-based Transformers (WT) for local feature matching and global feature aggregation in multi-view stereo.
1 code implementation • CVPR 2022 • Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, Xiao Liu
We cast MVS back to its nature as a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) that leverages intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.
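The intra-/inter-attention distinction the abstract describes can be sketched with a single scaled dot-product attention function: self-attention draws queries, keys, and values from one image's features, while cross-attention takes queries from one image and keys/values from another. The single head, small shapes, and absence of learned projections below are simplifying assumptions; the paper's FMT interleaves many such layers with learned weights.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
feats_ref = rng.normal(size=(5, 8))   # reference-image features (5 tokens, dim 8)
feats_src = rng.normal(size=(5, 8))   # source-image features

# Intra- (self-) attention: queries, keys, and values all come from the
# same image, aggregating long-range context within that image.
self_out = attention(feats_ref, feats_ref, feats_ref)

# Inter- (cross-) attention: queries come from the reference image while
# keys and values come from the source image, exchanging context across views.
cross_out = attention(feats_ref, feats_src, feats_src)

print(self_out.shape, cross_out.shape)   # both (5, 8)
```

Each output token is a convex combination of value tokens (the attention weights form a row-stochastic matrix), which is what lets the matcher pool evidence from every position in either image rather than a fixed local window.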
Ranked #8 on 3D Reconstruction on DTU