Search Results for author: Xiaozhi Chen

Found 20 papers, 8 papers with code

Adaptive Fusion of Single-View and Multi-View Depth for Autonomous Driving

1 code implementation 12 Mar 2024 Junda Cheng, Wei Yin, Kaixuan Wang, Xiaozhi Chen, Shijie Wang, Xin Yang

In this work, we propose a new robustness benchmark to evaluate the depth estimation system under various noisy pose settings.

Autonomous Driving Monocular Depth Estimation

GIM: Learning Generalizable Image Matcher From Internet Videos

no code implementations 16 Feb 2024 Xuelun Shen, Zhipeng Cai, Wei Yin, Matthias Müller, Zijun Li, Kaixuan Wang, Xiaozhi Chen, Cheng Wang

Given an architecture, GIM first trains it on standard domain-specific datasets and then combines it with complementary matching methods to create dense labels on nearby frames of novel videos.

Domain Generalization

UC-NeRF: Neural Radiance Field for Under-Calibrated Multi-view Cameras in Autonomous Driving

no code implementations 28 Nov 2023 Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen

Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities.

Autonomous Driving Depth Estimation +1

Metric3D: Towards Zero-shot Metric 3D Prediction from A Single Image

1 code implementation ICCV 2023 Wei Yin, Chi Zhang, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen

State-of-the-art (SOTA) monocular metric depth estimation methods can only handle a single camera model and are unable to perform mixed-data training due to the metric ambiguity.

Ranked #19 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)

Image Reconstruction Monocular Depth Estimation +1

Learning to Fuse Monocular and Multi-view Cues for Multi-frame Depth Estimation in Dynamic Scenes

1 code implementation CVPR 2023 Rui Li, Dong Gong, Wei Yin, Hao Chen, Yu Zhu, Kaixuan Wang, Xiaozhi Chen, Jinqiu Sun, Yanning Zhang

To let the geometric perception learned from multi-view cues in static areas propagate to the monocular representation in dynamic areas and let monocular cues enhance the representation of multi-view cost volume, we propose a cross-cue fusion (CCF) module, which includes the cross-cue attention (CCA) to encode the spatially non-local relative intra-relations from each source to enhance the representation of the other.

Autonomous Driving Depth Estimation

Are All Point Clouds Suitable for Completion? Weakly Supervised Quality Evaluation Network for Point Cloud Completion

no code implementations 3 Mar 2023 Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen

In this paper, we propose a quality evaluation network to score the point clouds and help judge the quality of the point cloud before applying the completion model.

Autonomous Driving Point Cloud Completion

You Only Label Once: 3D Box Adaptation from Point Cloud to Image via Semi-Supervised Learning

no code implementations 17 Nov 2022 Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen

The image-based 3D object detection task expects that the predicted 3D bounding box has a "tightness" projection (also referred to as cuboid), which fits the object contour well on the image while still keeping the geometric attribute on the 3D space, e.g., physical dimension, pairwise orthogonal, etc.

3D Object Detection Attribute +1

Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation

no code implementations 5 Oct 2022 Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji

During the iterative update, the results of depth estimation are compared across cameras and the information of overlapping areas is propagated to the whole depth maps with the help of basis formulation.

Depth Prediction Monocular Depth Estimation

UniFusion: Unified Multi-view Fusion Transformer for Spatial-Temporal Representation in Bird's-Eye-View

2 code implementations ICCV 2023 Zequn Qin, Jingyu Chen, Chao Chen, Xiaozhi Chen, Xi Li

Bird's eye view (BEV) representation is a new perception formulation for autonomous driving, which is based on spatial fusion.

Autonomous Driving

MonoJSG: Joint Semantic and Geometric Cost Volume for Monocular 3D Object Detection

1 code implementation CVPR 2022 Qing Lian, Peiliang Li, Xiaozhi Chen

Based on the object depth, the dense coordinates patch together with the corresponding object features is reprojected to the image space to build a cost volume in a joint semantic and geometric error manner.

Depth Estimation Monocular 3D Object Detection +2

Temporal Point Cloud Completion with Pose Disturbance

no code implementations 7 Feb 2022 Jieqi Shi, Lingyun Xu, Peiliang Li, Xiaozhi Chen, Shaojie Shen

With the help of gated recovery units (GRU) and attention mechanisms as temporal units, we propose a point cloud completion framework that accepts a sequence of unaligned and sparse inputs, and outputs consistent and aligned point clouds.

Point Cloud Completion

Geometry-based Distance Decomposition for Monocular 3D Object Detection

1 code implementation ICCV 2021 Xuepeng Shi, Qi Ye, Xiaozhi Chen, Chuangrong Chen, Zhixiang Chen, Tae-Kyun Kim

The experimental results show that our method achieves the state-of-the-art performance on the monocular 3D Object Detection and Birds Eye View tasks of the KITTI dataset, and can generalize to images with different camera intrinsics.

Autonomous Driving Monocular 3D Object Detection +2

Edge Preserving and Multi-Scale Contextual Neural Network for Salient Object Detection

no code implementations 29 Aug 2016 Xiang Wang, Huimin Ma, Xiaozhi Chen, ShaoDi You

In this paper, we propose a novel edge preserving and multi-scale contextual neural network for salient object detection.

Object object-detection +3

Improving Object Proposals With Multi-Thresholding Straddling Expansion

no code implementations CVPR 2015 Xiaozhi Chen, Huimin Ma, Xiang Wang, Zhichen Zhao

Based on the characteristics of superpixel tightness distribution, we propose an effective method, namely multi-thresholding straddling expansion (MTSE) to reduce localization bias via fast diversification.

Object object-detection +1
