Search Results for author: Shaoshuai Shi

Found 33 papers, 27 papers with code

AMP: Autoregressive Motion Prediction Revisited with Next Token Prediction for Autonomous Driving

no code implementations • 20 Mar 2024 • Xiaosong Jia, Shaoshuai Shi, Zijun Chen, Li Jiang, Wenlong Liao, Tao He, Junchi Yan

As an essential task in autonomous driving (AD), motion prediction aims to predict the future states of surrounding objects for navigation.

Motion Forecasting motion prediction +1


GiT: Towards Generalist Vision Transformer through Universal Language Interface

2 code implementations • 14 Mar 2024 • Haiyang Wang, Hao Tang, Li Jiang, Shaoshuai Shi, Muhammad Ferjad Naeem, Hongsheng Li, Bernt Schiele, LiWei Wang

Due to its simple design, this paradigm holds promise for narrowing the architectural gap between vision and language.

Language Modelling

200


UniTR: A Unified and Efficient Multi-Modal Transformer for Bird's-Eye-View Representation

2 code implementations • ICCV 2023 • Haiyang Wang, Hao Tang, Shaoshuai Shi, Aoxue Li, Zhenguo Li, Bernt Schiele, LiWei Wang

Jointly processing information from multiple sensors is crucial to achieving accurate and robust perception for reliable autonomous driving systems.

Ranked #8 on 3D Object Detection on nuScenes

3D Object Detection Autonomous Driving +2

323


MTR++: Multi-Agent Motion Prediction with Symmetric Scene Modeling and Guided Intention Querying

1 code implementation • 30 Jun 2023 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele

Extensive experimental results demonstrate that the MTR framework achieves state-of-the-art performance on the highly-competitive motion prediction benchmarks, while the MTR++ framework surpasses its precursor, exhibiting enhanced performance and efficiency in predicting accurate multimodal future trajectories for multiple agents.

Autonomous Driving motion prediction

562


TrajectoryFormer: 3D Object Tracking Transformer with Predictive Trajectory Hypotheses

1 code implementation • ICCV 2023 • Xuesong Chen, Shaoshuai Shi, Chao Zhang, Benjin Zhu, Qiang Wang, Ka Chun Cheung, Simon See, Hongsheng Li

3D multi-object tracking (MOT) is vital for many applications including autonomous driving vehicles and service robots.

3D Multi-Object Tracking 3D Object Tracking +2

101


Self-supervised Pre-training with Masked Shape Prediction for 3D Scene Understanding

no code implementations • CVPR 2023 • Li Jiang, Zetong Yang, Shaoshuai Shi, Vladislav Golyanik, Dengxin Dai, Bernt Schiele

Masked signal modeling has greatly advanced self-supervised pre-training for language and 2D images.

Scene Understanding


Sparse Dense Fusion for 3D Object Detection

no code implementations • 9 Apr 2023 • Yulu Gao, Chonghao Sima, Shaoshuai Shi, Shangzhe Di, Si Liu, Hongyang Li

With the prevalence of multimodal learning, camera-LiDAR fusion has gained popularity in 3D object detection.

3D Object Detection Object +1


Virtual Sparse Convolution for Multimodal 3D Object Detection

1 code implementation • CVPR 2023 • Hai Wu, Chenglu Wen, Shaoshuai Shi, Xin Li, Cheng Wang

Finally, we develop a semi-supervised pipeline VirConv-S based on a pseudo-label framework.

3D Object Detection Depth Completion +3

233


DSVT: Dynamic Sparse Voxel Transformer with Rotated Sets

3 code implementations • CVPR 2023 • Haiyang Wang, Chen Shi, Shaoshuai Shi, Meng Lei, Sen Wang, Di He, Bernt Schiele, LiWei Wang

However, due to the sparse characteristics of point clouds, it is non-trivial to apply a standard transformer on sparse points.

Ranked #1 on 3D Object Detection on nuScenes LiDAR only

3D Object Detection object-detection

4,766


CoIn: Contrastive Instance Feature Mining for Outdoor 3D Object Detection with Very Limited Annotations

1 code implementation • ICCV 2023 • Qiming Xia, Jinhao Deng, Chenglu Wen, Hai Wu, Shaoshuai Shi, Xin Li, Cheng Wang

Combining CoIn with an iterative training strategy, we propose a CoIn++ pipeline, which requires only 2% annotations in the KITTI dataset to achieve performance comparable to the fully supervised methods.

3D Object Detection Contrastive Learning +2


ConQueR: Query Contrast Voxel-DETR for 3D Object Detection

1 code implementation • CVPR 2023 • Benjin Zhu, Zhe Wang, Shaoshuai Shi, Hang Xu, Lanqing Hong, Hongsheng Li

We thus propose a Query Contrast mechanism to explicitly enhance queries towards their best-matched GTs over all unmatched query predictions.

3D Object Detection Object +1

101


CAGroup3D: Class-Aware Grouping for 3D Object Detection on Point Clouds

1 code implementation • 9 Oct 2022 • Haiyang Wang, Lihe Ding, Shaocong Dong, Shaoshuai Shi, Aoxue Li, Jianan Li, Zhenguo Li, LiWei Wang

We present a novel two-stage fully sparse convolutional 3D object detection framework, named CAGroup3D.

Ranked #1 on 3D Object Detection on SUN-RGBD

3D Object Detection object-detection


Motion Transformer with Global Intention Localization and Local Movement Refinement

2 code implementations • 27 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele

Predicting multimodal future behavior of traffic participants is essential for robotic vehicles to make safe decisions.

motion prediction Trajectory Prediction

562


MTR-A: 1st Place Solution for 2022 Waymo Open Dataset Challenge -- Motion Prediction

2 code implementations • 20 Sep 2022 • Shaoshuai Shi, Li Jiang, Dengxin Dai, Bernt Schiele

In this report, we present the 1st place solution for the motion prediction track of the 2022 Waymo Open Dataset Challenges.

motion prediction

562


3D Object Detection for Autonomous Driving: A Comprehensive Survey

1 code implementation • 19 Jun 2022 • Jiageng Mao, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li

Autonomous driving, in recent years, has been receiving increasing attention for its potential to relieve drivers' burdens and improve the safety of driving.

3D Object Detection Autonomous Driving +1

489


Towards Efficient 3D Object Detection with Knowledge Distillation

1 code implementation • 30 May 2022 • Jihan Yang, Shaoshuai Shi, Runyu Ding, Zhe Wang, Xiaojuan Qi

Then, we build a benchmark to assess existing KD methods developed in the 2D domain for 3D object detection upon six well-constructed teacher-student pairs.

3D Object Detection Knowledge Distillation +3

106


MPPNet: Multi-Frame Feature Intertwining with Proxy Points for 3D Temporal Object Detection

1 code implementation • 12 May 2022 • Xuesong Chen, Shaoshuai Shi, Benjin Zhu, Ka Chun Cheung, Hang Xu, Hongsheng Li

Accurate and reliable 3D detection is vital for many applications including autonomous driving vehicles and service robots.

Autonomous Driving object-detection +1

4,285


RBGNet: Ray-based Grouping for 3D Object Detection

1 code implementation • CVPR 2022 • Haiyang Wang, Shaoshuai Shi, Ze Yang, Rongyao Fang, Qi Qian, Hongsheng Li, Bernt Schiele, LiWei Wang

In order to learn better representations of object shape to enhance cluster features for predicting 3D boxes, we propose a ray-based feature grouping module, which aggregates the point-wise features on object surfaces using a group of determined rays uniformly emitted from cluster centers.

Ranked #13 on 3D Object Detection on ScanNetV2

3D Object Detection Object +1


Guided Point Contrastive Learning for Semi-supervised Point Cloud Semantic Segmentation

1 code implementation • ICCV 2021 • Li Jiang, Shaoshuai Shi, Zhuotao Tian, Xin Lai, Shu Liu, Chi-Wing Fu, Jiaya Jia

To address the high cost and challenges of 3D point-level labeling, we present a method for semi-supervised point cloud semantic segmentation to adopt unlabeled point clouds in training to boost the model performance.

3D Semantic Segmentation Contrastive Learning +1

478


LIGA-Stereo: Learning LiDAR Geometry Aware Representations for Stereo-based 3D Detector

1 code implementation • ICCV 2021 • Xiaoyang Guo, Shaoshuai Shi, Xiaogang Wang, Hongsheng Li

Compared with the state-of-the-art stereo detector, our method has improved the 3D detection performance of cars, pedestrians, and cyclists by 10.44%, 5.69%, and 5.97% mAP respectively on the official KITTI benchmark.

Ranked #2 on 3D Object Detection From Stereo Images on KITTI Cyclists Moderate

3D Object Detection From Stereo Images Stereo Matching


ST3D++: Denoised Self-training for Unsupervised Domain Adaptation on 3D Object Detection

no code implementations • 15 Aug 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi

These specific designs enable the detector to be trained on meticulously refined pseudo labeled target data with denoised training signals, and thus effectively facilitate adapting an object detector to a target domain without requiring annotations.

3D Object Detection Data Augmentation +5


Back-tracing Representative Points for Voting-based 3D Object Detection in Point Clouds

1 code implementation • CVPR 2021 • Bowen Cheng, Lu Sheng, Shaoshuai Shi, Ming Yang, Dong Xu

Inspired by the back-tracing strategy in the conventional Hough voting methods, in this work, we introduce a new 3D object detection method, named as Back-tracing Representative Points Network (BRNet), which generatively back-traces the representative points from the vote centers and also revisits complementary seed points around these generated points, so as to better capture the fine local structural features surrounding the potential objects from the raw point clouds.

Ranked #17 on 3D Object Detection on ScanNetV2

3D Object Detection Object +1


ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

1 code implementation • CVPR 2021 • Jihan Yang, Shaoshuai Shi, Zhe Wang, Hongsheng Li, Xiaojuan Qi

Then, the detector is iteratively improved on the target domain by alternately conducting two steps: pseudo label updating with the developed quality-aware triplet memory bank, and model training with curriculum data augmentation.

3D Object Detection Data Augmentation +4

282


PV-RCNN++: Point-Voxel Feature Set Abstraction With Local Vector Representation for 3D Object Detection

1 code implementation • 31 Jan 2021 • Shaoshuai Shi, Li Jiang, Jiajun Deng, Zhe Wang, Chaoxu Guo, Jianping Shi, Xiaogang Wang, Hongsheng Li

3D object detection is receiving increasing attention from both industry and academia thanks to its wide applications in various fields.

Ranked #2 on 3D Object Detection on KITTI Cars Easy val

3D Object Detection Object +1

4,285


Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

5 code implementations • 31 Dec 2020 • Jiajun Deng, Shaoshuai Shi, Peiwei Li, Wengang Zhou, Yanyong Zhang, Houqiang Li

In this paper, we take a slightly different viewpoint -- we find that precise positioning of raw points is not essential for high performance 3D object detection and that the coarse voxel granularity can also offer sufficient detection accuracy.

Ranked #4 on 3D Object Detection on KITTI Cars Moderate val

3D Object Detection object-detection +2

4,285


PV-RCNN: The Top-Performing LiDAR-only Solutions for 3D Detection / 3D Tracking / Domain Adaptation of Waymo Open Dataset Challenges

1 code implementation • 28 Aug 2020 • Shaoshuai Shi, Chaoxu Guo, Jihan Yang, Hongsheng Li

In this technical report, we present the top-performing LiDAR-only solutions for the 3D detection, 3D tracking, and domain adaptation tracks of the Waymo Open Dataset Challenges 2020.

3D Object Detection Domain Adaptation +1

4,285


PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation

2 code implementations • CVPR 2020 • Li Jiang, Hengshuang Zhao, Shaoshuai Shi, Shu Liu, Chi-Wing Fu, Jiaya Jia

Instance segmentation is an important task for scene understanding.

Ranked #5 on 3D Instance Segmentation on STPLS3D

3D Instance Segmentation Clustering +3

1,097


SegVoxelNet: Exploring Semantic Context and Depth-aware Features for 3D Vehicle Detection from Point Cloud

no code implementations • 13 Feb 2020 • Hongwei Yi, Shaoshuai Shi, Mingyu Ding, Jiankai Sun, Kui Xu, Hui Zhou, Zhe Wang, Sheng Li, Guoping Wang

First, semantic context information in LiDAR, which may help identify ambiguous vehicles, is seldom explored in previous works.

Autonomous Driving Semantic Segmentation


PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

12 code implementations • CVPR 2020 • Shaoshuai Shi, Chaoxu Guo, Li Jiang, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li

We present a novel and high-performance 3D object detection framework, named PointVoxel-RCNN (PV-RCNN), for accurate 3D object detection from point clouds.

Ranked #1 on Birds Eye View Object Detection on KITTI Cyclists Easy

Object object-detection +1

4,285


From Points to Parts: 3D Object Detection from Point Cloud with Part-aware and Part-aggregation Network

6 code implementations • 8 Jul 2019 • Shaoshuai Shi, Zhe Wang, Jianping Shi, Xiaogang Wang, Hongsheng Li

3D object detection from LiDAR point cloud is a challenging problem in 3D scene understanding and has many practical applications.

3D Object Detection Object +2

4,766


Feature Intertwiner for Object Detection

2 code implementations • ICLR 2019 • Hongyang Li, Bo Dai, Shaoshuai Shi, Wanli Ouyang, Xiaogang Wang

We argue that the reliable set could guide the feature learning of the less reliable set during training, in the spirit of a student mimicking a teacher's behavior, thus pushing towards a more compact class centroid in the feature space.

Ranked #145 on Object Detection on COCO test-dev