1 code implementation • 12 Mar 2024 • Junda Cheng, Wei Yin, Kaixuan Wang, Xiaozhi Chen, Shijie Wang, Xin Yang
In this work, we propose a new robustness benchmark to evaluate the depth estimation system under various noisy pose settings.
Ranked #1 on Monocular Depth Estimation on DDAD
no code implementations • 16 Feb 2024 • Xuelun Shen, Zhipeng Cai, Wei Yin, Matthias Müller, Zijun Li, Kaixuan Wang, Xiaozhi Chen, Cheng Wang
Given an architecture, GIM first trains it on standard domain-specific datasets and then combines it with complementary matching methods to create dense labels on nearby frames of novel videos.
no code implementations • 28 Nov 2023 • Kai Cheng, Xiaoxiao Long, Wei Yin, Jin Wang, Zhiqiang Wu, Yuexin Ma, Kaixuan Wang, Xiaozhi Chen, Xuejin Chen
Multi-camera setups find widespread use across various applications, such as autonomous driving, as they greatly expand sensing capabilities.
1 code implementation • ICCV 2023 • Wei Yin, Chi Zhang, Hao Chen, Zhipeng Cai, Gang Yu, Kaixuan Wang, Xiaozhi Chen, Chunhua Shen
State-of-the-art (SOTA) monocular metric depth estimation methods can only handle a single camera model and are unable to perform mixed-data training due to the metric ambiguity.
Ranked #19 on Monocular Depth Estimation on NYU-Depth V2 (using extra training data)
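The metric ambiguity mentioned in this entry can be illustrated with plain pinhole-camera arithmetic. The sketch below is not taken from the paper; the function name and all numeric values are hypothetical. It shows only that two cameras with different focal lengths can produce the same image measurement for the same object at different depths, which is why mixing data from different cameras is problematic for metric depth.

```python
def projected_height_px(focal_px: float, object_height_m: float, depth_m: float) -> float:
    """Pinhole model: height in pixels of an object seen at a given depth.

    Hypothetical helper for illustration only.
    """
    if depth_m <= 0:
        raise ValueError("depth must be positive")
    return focal_px * object_height_m / depth_m


# Camera A: focal length 500 px, a 1.5 m tall object at 10 m.
height_a = projected_height_px(500.0, 1.5, 10.0)

# Camera B: focal length 1000 px, the same object at 20 m.
height_b = projected_height_px(1000.0, 1.5, 20.0)

# Both cameras observe the same pixel height, so the image alone
# cannot tell the two depths apart without knowing the focal length.
print(height_a, height_b)
```

Doubling both the focal length and the depth leaves the projected size unchanged, so a network trained on images alone has no cue to separate the two cases.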
1 code implementation • CVPR 2023 • Rui Li, Dong Gong, Wei Yin, Hao Chen, Yu Zhu, Kaixuan Wang, Xiaozhi Chen, Jinqiu Sun, Yanning Zhang
To let the geometric perception learned from multi-view cues in static areas propagate to the monocular representation in dynamic areas and let monocular cues enhance the representation of multi-view cost volume, we propose a cross-cue fusion (CCF) module, which includes the cross-cue attention (CCA) to encode the spatially non-local relative intra-relations from each source to enhance the representation of the other.
no code implementations • 14 Apr 2023 • Jaime Spencer, C. Stella Qian, Michaela Trescakova, Chris Russell, Simon Hadfield, Erich W. Graf, Wendy J. Adams, Andrew J. Schofield, James Elder, Richard Bowden, Ali Anwar, Hao Chen, Xiaozhi Chen, Kai Cheng, Yuchao Dai, Huynh Thai Hoa, Sadat Hossain, Jianmian Huang, Mohan Jing, Bo Li, Chao Li, Baojun Li, Zhiwen Liu, Stefano Mattoccia, Siegfried Mercelis, Myungwoo Nam, Matteo Poggi, Xiaohua Qi, Jiahui Ren, Yang Tang, Fabio Tosi, Linh Trinh, S. M. Nadim Uddin, Khan Muhammad Umair, Kaixuan Wang, YuFei Wang, Yixing Wang, Mochu Xiang, Guangkai Xu, Wei Yin, Jun Yu, Qi Zhang, Chaoqiang Zhao
This paper discusses the results for the second edition of the Monocular Depth Estimation Challenge (MDEC).
no code implementations • 3 Mar 2023 • Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen
In this paper, we propose a quality evaluation network to score the point clouds and help judge the quality of the point cloud before applying the completion model.
no code implementations • 17 Nov 2022 • Jieqi Shi, Peiliang Li, Xiaozhi Chen, Shaojie Shen
The image-based 3D object detection task expects that the predicted 3D bounding box has a "tightness" projection (also referred to as cuboid), which fits the object contour well on the image while still keeping the geometric attribute on the 3D space, e.g., physical dimension, pairwise orthogonal, etc.
no code implementations • 5 Oct 2022 • Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji
During the iterative update, the results of depth estimation are compared across cameras and the information of overlapping areas is propagated to the whole depth maps with the help of basis formulation.
2 code implementations • ICCV 2023 • Zequn Qin, Jingyu Chen, Chao Chen, Xiaozhi Chen, Xi Li
Bird's eye view (BEV) representation is a new perception formulation for autonomous driving, which is based on spatial fusion.
1 code implementation • CVPR 2022 • Qing Lian, Peiliang Li, Xiaozhi Chen
Based on the object depth, the dense coordinates patch together with the corresponding object features is reprojected to the image space to build a cost volume in a joint semantic and geometric error manner.
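The reprojection step described in this entry, mapping 3D coordinates back to image space, can be sketched with a standard pinhole projection. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name, the intrinsics `fx`, `fy`, `cx`, `cy`, and the sample points are all hypothetical, and the cost-volume construction itself is not shown.

```python
def project_point(point_cam, fx, fy, cx, cy):
    """Project a 3D point (x, y, z) in the camera frame to pixel coordinates (u, v).

    Hypothetical helper: assumes an ideal pinhole camera without distortion.
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = fx * x / z + cx
    v = fy * y / z + cy
    return (u, v)


# Assumed intrinsics for illustration only.
fx, fy, cx, cy = 700.0, 700.0, 320.0, 240.0

# A point on the optical axis lands on the principal point.
center = project_point((0.0, 0.0, 10.0), fx, fy, cx, cy)

# An off-axis point at 20 m depth.
offset = project_point((1.0, -0.5, 20.0), fx, fy, cx, cy)

print(center, offset)
```

In a pipeline like the one described, features sampled at the projected pixel locations would then be compared against the object features to score each depth hypothesis.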
no code implementations • 7 Feb 2022 • Jieqi Shi, Lingyun Xu, Peiliang Li, Xiaozhi Chen, Shaojie Shen
With the help of gated recovery units (GRU) and attention mechanisms as temporal units, we propose a point cloud completion framework that accepts a sequence of unaligned and sparse inputs, and outputs consistent and aligned point clouds.
1 code implementation • ICCV 2021 • Xuepeng Shi, Qi Ye, Xiaozhi Chen, Chuangrong Chen, Zhixiang Chen, Tae-Kyun Kim
The experimental results show that our method achieves state-of-the-art performance on the monocular 3D Object Detection and Bird's Eye View tasks of the KITTI dataset, and can generalize to images with different camera intrinsics.
Ranked #15 on Monocular 3D Object Detection on KITTI Cars Moderate
4 code implementations • CVPR 2019 • Peiliang Li, Xiaozhi Chen, Shaojie Shen
Our method, called Stereo R-CNN, extends Faster R-CNN for stereo inputs to simultaneously detect and associate objects in left and right images.
3D Object Detection From Stereo Images • Autonomous Driving
3 code implementations • CVPR 2017 • Xiaozhi Chen, Huimin Ma, Ji Wan, Bo Li, Tian Xia
We encode the sparse 3D point cloud with a compact multi-view representation.
no code implementations • 29 Aug 2016 • Xiang Wang, Huimin Ma, Xiaozhi Chen, ShaoDi You
In this paper, we propose a novel edge preserving and multi-scale contextual neural network for salient object detection.
no code implementations • 27 Aug 2016 • Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Huimin Ma, Sanja Fidler, Raquel Urtasun
We then exploit a CNN on top of these proposals to perform object detection.
no code implementations • CVPR 2016 • Xiaozhi Chen, Kaustav Kundu, Ziyu Zhang, Huimin Ma, Sanja Fidler, Raquel Urtasun
The focus of this paper is on proposal generation.
Ranked #8 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • NeurIPS 2015 • Xiaozhi Chen, Kaustav Kundu, Yukun Zhu, Andrew G. Berneshawi, Huimin Ma, Sanja Fidler, Raquel Urtasun
The goal of this paper is to generate high-quality 3D object proposals in the context of autonomous driving.
Ranked #10 on Vehicle Pose Estimation on KITTI Cars Hard
no code implementations • CVPR 2015 • Xiaozhi Chen, Huimin Ma, Xiang Wang, Zhichen Zhao
Based on the characteristics of superpixel tightness distribution, we propose an effective method, namely multi-thresholding straddling expansion (MTSE), to reduce localization bias via fast diversification.