no code implementations • 20 Mar 2024 • Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan
As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surrounding objects for navigation.
1 code implementation • 14 Mar 2024 • Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, LiWei Wang
Due to its simple design, this paradigm holds promise for narrowing the architectural gap between vision and language.
Ranked #2 on Video Captioning on MSVD-CTN (using extra training data)
no code implementations • CVPR 2024 • Li Jiang, Shaoshuai Shi, Bernt Schiele
In dynamic 3D environments, the ability to recognize a diverse range of objects without the constraints of predefined categories is indispensable for real-world applications.
3D Semantic Segmentation • Open Vocabulary Semantic Segmentation
3 code implementations • ICCV 2023 • Haiyang Wang, Hao Tang, Shaoshuai Shi, Aoxue Li, Zhenguo Li, Bernt Schiele, LiWei Wang
Jointly processing information from multiple sensors is crucial to achieving accurate and robust perception for reliable autonomous driving systems.
Ranked #8 on 3D Object Detection on nuScenes
1 code implementation • 30 Jun 2023 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on highly competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.
1 code implementation • ICCV 2023 • Xuesong Chen, Shaoshuai Shi, Chao Zhang, Benjin Zhu, Qiang Wang, Ka Chun Cheung, Simon See, Hongsheng Li
3D multi-object tracking (MOT) is vital for many applications including autonomous driving vehicles and service robots.
no code implementations • CVPR 2023 • Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele
Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images.
no code implementations • 9 Apr 2023 • Yulu Gao, Chonghao Sima, Shaoshuai Shi, Shangzhe Di, Si Liu, Hongyang Li
With the prevalence of multimodal learning, camera-LiDAR fusion has gained popularity in 3D object detection.
1 code implementation • CVPR 2023 • Hai Wu, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang
Finally, we develop a semi-supervised pipeline VirConv-S based on a pseudo-label framework.
3 code implementations • CVPR 2023 • Haiyang Wang, Chen Shi, Shaoshuai Shi, Meng Lei, Sen Wang, Di He, Bernt Schiele, LiWei Wang
However, due to the sparse characteristics of point clouds, it is non-trivial to apply a standard transformer on sparse points.
Ranked #1 on 3D Object Detection on waymo cyclist
1 code implementation • ICCV 2023 • Qiming Xia, Jinhao Deng, Chenglu Wen, Hai Wu, Shaoshuai Shi, Xin Li, Cheng Wang
Combining CoIn with an iterative training strategy, we propose a CoIn++ pipeline, which requires only 2% annotations in the KITTI dataset to achieve performance comparable to the fully supervised methods.
1 code implementation • CVPR 2023 • Benjin Zhu, Zhe Wang, Shaoshuai Shi, Hang Xu, Lanqing Hong, Hongsheng Li
We thus propose a Query Contrast mechanism to explicitly enhance queries towards their best-matched GTs over all unmatched query predictions.
1 code implementation • 9 Oct 2022 • Haiyang Wang, Lihe Ding, Shaocong Dong, Shaoshuai Shi, Aoxue Li, Jianan Li, Zhenguo Li, LiWei Wang
We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.
Ranked #1 on 3D Object Detection on SUN-RGBD
3 code implementations • 27 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
Predicting multimodal future behavior of traffic participants is essential for robotic vehicles to make safe decisions.
2 code implementations • 20 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele
In this report, we present the 1st place solution for the motion prediction track of the 2022 Waymo Open Dataset Challenges.
1 code implementation • 19 Jun 2022 • Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
Autonomous driving, in recent years, has been receiving increasing attention for its potential to relieve drivers' burdens and improve the safety of driving.
1 code implementation • 30 May 2022 • Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi
Then, we build a benchmark to assess existing KD methods developed in the 2D domain for 3D object detection upon six well-constructed teacher-student pairs.
1 code implementation • 12 May 2022 • Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li
Accurate and reliable 3D detection is vital for many applications including autonomous driving vehicles and service robots.
1 code implementation • CVPR 2022 • Haiyang Wang, Shaoshuai Shi, Ze Yang, Rongyao Fang, Qi Qian, Hongsheng Li, Bernt Schiele, LiWei Wang
In order to learn better representations of object shape to enhance cluster features for predicting 3D boxes, we propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays uniformly emitted from cluster centers.
Ranked #13 on 3D Object Detection on ScanNetV2
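The ray-based grouping idea described in this entry can be illustrated with a rough sketch. This is not the authors' implementation: the Fibonacci-sphere ray layout, the angular-cone test, the max pooling, and the names `fibonacci_directions`, `ray_group`, and `min_cos` are all assumptions chosen for illustration. It shows only the general pattern of emitting a fixed set of roughly uniform rays from a cluster center and pooling point-wise features per ray.

```python
import numpy as np

def fibonacci_directions(n_rays):
    """Return n_rays roughly uniform unit vectors on the sphere (assumed layout)."""
    i = np.arange(n_rays) + 0.5
    z = 1.0 - 2.0 * i / n_rays                 # evenly spaced heights in (-1, 1)
    r = np.sqrt(1.0 - z * z)                   # ring radius at each height
    phi = np.pi * (1.0 + np.sqrt(5.0)) * i     # golden-angle azimuth steps
    return np.stack([r * np.cos(phi), r * np.sin(phi), z], axis=1)

def ray_group(center, points, feats, n_rays=8, min_cos=0.9):
    """Pool point features per ray and concatenate them (hypothetical helper).

    A point is assigned to a ray when the direction from the center to the
    point lies within the cone cos(angle) >= min_cos around that ray.
    """
    dirs = fibonacci_directions(n_rays)                       # (R, 3)
    offsets = np.asarray(points, dtype=np.float64) - center   # (N, 3)
    dist = np.linalg.norm(offsets, axis=1, keepdims=True)
    unit = offsets / np.maximum(dist, 1e-9)                   # avoid divide-by-zero
    cos = unit @ dirs.T                                       # (N, R)
    pooled = np.zeros((n_rays, feats.shape[1]))
    for ray in range(n_rays):
        mask = cos[:, ray] >= min_cos
        if mask.any():
            pooled[ray] = feats[mask].max(axis=0)             # max pool per ray
    return pooled.reshape(-1)                                 # (R * C,)

# Usage: one surface point lying exactly on the first ray.
center = np.zeros(3)
dirs = fibonacci_directions(8)
points = 2.0 * dirs[:1]
feats = np.array([[1.0, 2.0]])
grouped = ray_group(center, points, feats)
```

Because the rays are fixed in advance, the concatenated output has the same layout for every cluster, which is what lets a downstream box head consume it as a shape descriptor.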
2 code implementations • ICCV 2021 • Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia
To address the high cost and challenges of 3D point-level labeling, we present a method for semi-supervised point cloud semantic segmentation that uses unlabeled point clouds in training to boost model performance.
1 code implementation • ICCV 2021 • Xiaoyang Guo, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
Compared with the state-of-the-art stereo detector, our method has improved the 3D detection performance of cars, pedestrians, and cyclists by 10.44%, 5.69%, and 5.97% mAP respectively on the official KITTI benchmark.
no code implementations • 15 Aug 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi
These specific designs enable the detector to be trained on meticulously refined pseudo-labeled target data with denoised training signals, and thus effectively facilitate adapting an object detector to a target domain without requiring annotations.
1 code implementation • CVPR 2021 • Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu
Inspired by the back-tracing strategy in conventional Hough voting methods, in this work we introduce a new 3D object detection method, named Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds.
Ranked #17 on 3D Object Detection on ScanNetV2
1 code implementation • CVPR 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi
Then, the detector is iteratively improved on the target domain by alternating between two steps: updating pseudo labels with the developed quality-aware triplet memory bank, and training the model with curriculum data augmentation.
1 code implementation • 31 Jan 2021 • Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li
3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields.
Ranked #2 on 3D Object Detection on KITTI Cars Easy val
5 code implementations • 31 Dec 2020 • Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li
In this paper, we take a slightly different viewpoint: we find that precise positioning of raw points is not essential for high-performance 3D object detection and that the coarse voxel granularity can also offer sufficient detection accuracy.
Ranked #4 on 3D Object Detection on KITTI Cars Moderate val
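To make "coarse voxel granularity" in this entry concrete, here is a minimal sketch of generic point cloud voxelization. It is not the paper's pipeline: the function name `voxelize`, the `origin` parameter, and the choice of per-voxel centroids as the summary are assumptions for illustration. Each point is mapped to an integer grid cell, and all points sharing a cell are replaced by one representative, so the exact raw positions are discarded.

```python
import numpy as np

def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Return occupied voxel coordinates and the centroid of the points in each."""
    points = np.asarray(points, dtype=np.float64)
    # Integer grid coordinate of every point.
    coords = np.floor((points - np.asarray(origin)) / voxel_size).astype(np.int64)
    # Unique occupied voxels; `inverse` maps each point to its voxel row.
    voxels, inverse = np.unique(coords, axis=0, return_inverse=True)
    inverse = inverse.reshape(-1)  # keep a flat index across NumPy versions
    counts = np.bincount(inverse, minlength=len(voxels))
    centroids = np.zeros((len(voxels), 3))
    np.add.at(centroids, inverse, points)  # sum the points falling in each voxel
    centroids /= counts[:, None]
    return voxels, centroids

# Usage: three points, two of which share the voxel at grid cell (0, 0, 0).
cloud = [[0.1, 0.1, 0.1], [0.3, 0.3, 0.3], [1.2, 0.1, 0.1]]
voxels, centroids = voxelize(cloud, voxel_size=1.0)
```

A larger `voxel_size` merges more points per cell, trading positional precision for fewer elements to process.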
1 code implementation • 28 Aug 2020 • Shaoshuai Shi, Chaoxu Guo, Jihan Yang, Hongsheng Li
In this technical report, we present the top-performing LiDAR-only solutions for the 3D detection, 3D tracking, and domain adaptation tracks of the Waymo Open Dataset Challenges 2020.
2 code implementations • CVPR 2020 • Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia
Instance segmentation is an important task for scene understanding.
Ranked #5 on 3D Instance Segmentation on STPLS3D
no code implementations • 13 Feb 2020 • Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Hui Zhou, Zhe Wang, Sheng Li, Guoping Wang
First, the semantic context information in LiDAR is seldom explored in previous works, which may help identify ambiguous vehicles.
12 code implementations • CVPR 2020 • Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
We present a novel and high-performance 3D object detection framework, named PointVoxel-RCNN (PV-RCNN), for accurate 3D object detection from point clouds.
6 code implementations • 8 Jul 2019 • Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li
3D object detection from LiDAR point clouds is a challenging problem in 3D scene understanding and has many practical applications.
2 code implementations • ICLR 2019 • Hongyang Li, Bo Dai, Shaoshuai Shi, Wanli Ouyang, Xiaogang Wang
We argue that the reliable set could guide the feature learning of the less reliable set during training, in the spirit of a student mimicking a teacher's behavior, thus pushing towards a more compact class centroid in the feature space.
Ranked #145 on Object Detection on COCO test-dev
13 code implementations • CVPR 2019 • Shaoshuai Shi, Xiaogang Wang, Hongsheng Li
In this paper, we propose PointRCNN for 3D object detection from raw point clouds.
Ranked #2 on Object Detection on KITTI Cars Moderate
no code implementations • ECCV 2018 • Li Jiang, Shaoshuai Shi, Xiaojuan Qi, Jiaya Jia
We propose to add geometric adversarial loss (GAL).