Search Results for author: Yikang Ding

Found 14 papers, 6 papers with code

DiST-4D: Disentangled Spatiotemporal Diffusion with Metric Depth for 4D Driving Scene Generation

no code implementations • 19 Mar 2025 • Jiazhe Guo, Yikang Ding, Xiwu Chen, Shuo Chen, Bohan Li, Yingshuang Zou, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Zhiheng Li, Hao Zhao

To address this, we propose DiST-4D, the first disentangled spatiotemporal diffusion framework for 4D driving scene generation, which leverages metric depth as the core geometric representation.

Novel View Synthesis, Scene Generation

MuDG: Taming Multi-modal Diffusion with Gaussian Splatting for Urban Scene Reconstruction

no code implementations • 13 Mar 2025 • Yingshuang Zou, Yikang Ding, Chuanrui Zhang, Jiazhe Guo, Bohan Li, Xiaoyang Lyu, Feiyang Tan, Xiaojuan Qi, Haoqian Wang

Recent breakthroughs in radiance fields have significantly advanced 3D scene reconstruction and novel view synthesis (NVS) in autonomous driving.

3DGS, 3D Scene Reconstruction, +2 more

UniScene: Unified Occupancy-centric Driving Scene Generation

no code implementations • 6 Dec 2024 • Bohan Li, Jiazhe Guo, Hongsi Liu, Yingshuang Zou, Yikang Ding, Xiwu Chen, Hu Zhu, Feiyang Tan, Chi Zhang, Tiancai Wang, Shuchang Zhou, Li Zhang, Xiaojuan Qi, Hao Zhao, Mu Yang, Wenjun Zeng, Xin Jin

UniScene employs a progressive generation process that decomposes the complex task of scene generation into two hierarchical steps: (a) first generating semantic occupancy from a customized scene layout as a meta scene representation rich in both semantic and geometric information, and then (b) conditioned on occupancy, generating video and LiDAR data, respectively, with two novel transfer strategies of Gaussian-based Joint Rendering and Prior-guided Sparse Modeling.

Autonomous Driving, Scene Generation

M${^2}$Depth: Self-supervised Two-Frame Multi-camera Metric Depth Estimation

no code implementations • 3 May 2024 • Yingshuang Zou, Yikang Ding, Xi Qiu, Haoqian Wang, Haotian Zhang

This paper presents a novel self-supervised two-frame multi-camera metric depth estimation network, termed M${^2}$Depth, which is designed to predict reliable scale-aware surrounding depth in autonomous driving.

Autonomous Driving, Depth Estimation

OccTransformer: Improving BEVFormer for 3D camera-only occupancy prediction

no code implementations • 28 Feb 2024 • Jian Liu, Sipeng Zhang, Chuixin Kong, Wenyuan Zhang, Yuhang Wu, Yikang Ding, Borun Xu, Ruibo Ming, Donglai Wei, Xianming Liu

This technical report presents our solution, "occTransformer", for the 3D occupancy prediction track in the autonomous driving challenge at CVPR 2023.

Autonomous Driving, Data Augmentation, +1 more

LayerDiffusion: Layered Controlled Image Editing with Diffusion Models

no code implementations • 30 May 2023 • Pengzhi Li, Qinxuan Huang, Yikang Ding, Zhiheng Li

During the diffusion process, an iterative guidance strategy is used to generate a final image that aligns with the textual description.

Attribute, text-guided-image-editing

Rethinking Dimensionality Reduction in Grid-based 3D Object Detection

no code implementations • 20 Sep 2022 • Dihe Huang, Ying Chen, Yikang Ding, Jinli Liao, Jianlin Liu, Kai Wu, Qiang Nie, Yong Liu, Chengjie Wang, Zhiheng Li

In MDRNet, the Spatial-aware Dimensionality Reduction (SDR) is designed to dynamically focus on the valuable parts of the object during voxel-to-BEV feature transformation.

3D Object Detection, Cloud Detection, +3 more

KD-MVS: Knowledge Distillation Based Self-supervised Learning for Multi-view Stereo

1 code implementation • 21 Jul 2022 • Yikang Ding, Qingtian Zhu, Xiangyue Liu, Wentao Yuan, Haotian Zhang, Chi Zhang

Supervised multi-view stereo (MVS) methods have achieved remarkable progress in terms of reconstruction quality, but suffer from the challenge of collecting large-scale ground-truth depth.

Knowledge Distillation, Self-Supervised Learning

Sobolev Training for Implicit Neural Representations with Approximated Image Derivatives

1 code implementation • 21 Jul 2022 • Wentao Yuan, Qingtian Zhu, Xiangyue Liu, Yikang Ding, Haotian Zhang, Chi Zhang

Recently, Implicit Neural Representations (INRs) parameterized by neural networks have emerged as a powerful and promising tool to represent different kinds of signals due to their continuous, differentiable properties, showing superiority over classical discretized representations.

Inverse Rendering

Adaptive Assignment for Geometry Aware Local Feature Matching

1 code implementation • CVPR 2023 • Dihe Huang, Ying Chen, Shang Xu, Yong Liu, Wenlong Wu, Yikang Ding, Chengjie Wang, Fan Tang

Detector-free feature matching approaches are currently attracting great attention thanks to their excellent performance.

Feature Correlation

Enhancing Multi-view Stereo with Contrastive Matching and Weighted Focal Loss

1 code implementation • 21 Jun 2022 • Yikang Ding, Zhenyang Li, Dihe Huang, Zhiheng Li, Kai Zhang

Learning-based multi-view stereo (MVS) methods have made impressive progress and surpassed traditional methods in recent years.

Contrastive Learning

WT-MVSNet: Window-based Transformers for Multi-view Stereo

no code implementations • 28 May 2022 • Jinli Liao, Yikang Ding, Yoli Shavit, Dihe Huang, Shihao Ren, Jia Guo, Wensen Feng, Kai Zhang

In this work, we propose Window-based Transformers (WT) for local feature matching and global feature aggregation in multi-view stereo.

TransMVSNet: Global Context-aware Multi-view Stereo Network with Transformers

1 code implementation • CVPR 2022 • Yikang Ding, Wentao Yuan, Qingtian Zhu, Haotian Zhang, Xiangyue Liu, Yuanjiang Wang, Xiao Liu

We analogize MVS back to its nature of a feature matching task and therefore propose a powerful Feature Matching Transformer (FMT) to leverage intra- (self-) and inter- (cross-) attention to aggregate long-range context information within and across images.

3D Reconstruction, Feature Correlation
