no code implementations • 3 May 2024 • Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang
This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M$^2$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.
no code implementations • 28 Feb 2024 • Jian Liu, Sipeng Zhang, Chuixin Kong, Wenyuan Zhang, Yuhang Wu, Yikang Ding, Borun Xu, Ruibo Ming, Donglai Wei, Xianming Liu
This technical report presents our solution, "occTransformer" for the 3D occupancy prediction track in the autonomous driving challenge at CVPR 2023.
no code implementations • 30 May 2023 • Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li
During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description.
no code implementations • 20 Sep 2022 • Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li
In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.
1 code implementation • 21 Jul 2022 • Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang
Supervised multi-view stereo (MVS) methods have achieved remarkable progress in terms of reconstruction quality, but suffer from the challenge of collecting large-scale ground-truth depth.
1 code implementation • 21 Jul 2022 • Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang
Recently, Implicit Neural Representations (INRs) parameterized by neural networks have emerged as a powerful and promising tool to represent different kinds of signals due to their continuous and differentiable properties, showing superiority over classical discretized representations.
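The core idea behind INRs — a learned function mapping continuous coordinates to signal values, queryable at arbitrary resolution — can be sketched in a few lines. The sketch below stands in a simple random-Fourier-feature regression for the neural network; the frequency scale and feature count are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

# Hedged sketch: represent a discretely sampled 1D signal as a continuous
# function of the coordinate, using random Fourier features plus a linear
# head fitted by least squares. This stands in for the coordinate-to-value
# network of an INR; the paper's actual model and training differ.

rng = np.random.default_rng(0)

# "Ground-truth" signal sampled at 64 discrete points.
xs = np.linspace(0.0, 1.0, 64)
signal = np.sin(2 * np.pi * 3 * xs) + 0.5 * np.cos(2 * np.pi * 7 * xs)

# Random Fourier-feature embedding of the scalar coordinate
# (frequency scale 10.0 and 16 frequencies are assumed hyperparameters).
B = rng.normal(scale=10.0, size=16)

def features(x):
    """Embed coordinates x into [sin(xB), cos(xB)] features."""
    x = np.atleast_1d(x)
    return np.concatenate([np.sin(np.outer(x, B)), np.cos(np.outer(x, B))], axis=1)

# Fit the linear head by least squares on the discrete samples.
w, *_ = np.linalg.lstsq(features(xs), signal, rcond=None)

# The representation is continuous and differentiable: it can be queried
# at a coordinate that never appeared in the discrete samples.
query = 0.123
approx = float(features(query) @ w)
print(approx)
```

Because the representation is a function of the coordinate rather than a fixed grid, the same fitted weights answer queries at any resolution — the property the abstract contrasts with classical discretized representations.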
1 code implementation • CVPR 2023 • Dihe Huang, Ying Chen, Shang Xu, Yong Liu, Wenlong Wu, Yikang Ding, Chengjie Wang, Fan Tang
Detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.
1 code implementation • 21 Jun 2022 • Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang
Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.
no code implementations • 28 May 2022 • Jinli Liao, Yikang Ding, Yoli Shavit, Dihe Huang, Shihao Ren, Jia Guo, Wensen Feng, Kai Zhang
In this work, we propose Window-based Transformers (WT) for local feature matching and global feature aggregation in multi-view stereo.
1 code implementation • CVPR 2022 • Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, Xiao Liu
We cast MVS back to its nature as a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) that leverages intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.
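The intra-/inter-attention distinction the abstract describes can be sketched with a single scaled dot-product attention function: self-attention draws queries, keys, and values from one image's features, while cross-attention takes queries from one image and keys/values from another. The single head, small shapes, and absence of learned projections below are simplifying assumptions; the paper's FMT interleaves many such layers with learned weights.

```python
import numpy as np

def attention(q, k, v):
    """Scaled dot-product attention: softmax(q k^T / sqrt(d)) v."""
    d = q.shape[-1]
    scores = q @ k.T / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v

rng = np.random.default_rng(0)
feats_ref = rng.normal(size=(5, 8))   # reference-image features (5 tokens, dim 8)
feats_src = rng.normal(size=(5, 8))   # source-image features

# Intra- (self-) attention: queries, keys, and values all come from the
# same image, aggregating long-range context within that image.
self_out = attention(feats_ref, feats_ref, feats_ref)

# Inter- (cross-) attention: queries come from the reference image while
# keys and values come from the source image, exchanging context across views.
cross_out = attention(feats_ref, feats_src, feats_src)

print(self_out.shape, cross_out.shape)   # both (5, 8)
```

Each output token is a convex combination of value tokens (the attention weights form a row-stochastic matrix), which is what lets the matcher pool evidence from every position in either image rather than a fixed local window.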
Ranked #8 on 3D Reconstruction on DTU