M3DeTR: Multi-representation, Multi-scale, Mutual-relation 3D Object Detection with Transformers

24 Apr 2021 · Tianrui Guan, Jun Wang, Shiyi Lan, Rohan Chandra, Zuxuan Wu, Larry Davis, Dinesh Manocha

We present M3DeTR, a novel architecture for 3D object detection that combines different point cloud representations (raw points, voxels, bird's-eye view) with different feature scales based on multi-scale feature pyramids. M3DeTR is the first approach that simultaneously unifies multiple point cloud representations and feature scales and models mutual relationships between point clouds using transformers. We perform extensive ablation experiments that highlight the benefits of fusing representations and scales and of modeling these relationships. Our method achieves state-of-the-art performance on the KITTI 3D object detection dataset and the Waymo Open Dataset. M3DeTR improves on the baseline by 1.48% mAP across all classes on the Waymo Open Dataset. In particular, our approach ranks 1st on the KITTI 3D Detection Benchmark for both the car and cyclist classes, and 1st on the Waymo Open Dataset with single-frame point cloud input. Our code is available at: https://github.com/rayguan97/M3DETR.
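To make the idea of relating features from several representations more concrete, here is a minimal sketch of scaled dot-product self-attention applied to three feature vectors, one per representation, for a single location. It is an illustration only, not the authors' implementation: the feature values, the 4-dimensional size, the `self_attention` helper, and the use of identity projections in place of learned query/key/value weights are all assumptions made for this example.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(tokens):
    """Single-head scaled dot-product self-attention.

    Assumption: identity projections are used instead of learned
    query/key/value matrices, so each token acts as its own q, k, and v.

    tokens: list of equal-length feature vectors (one per representation).
    Returns (outputs, weights), where weights[i][j] is how strongly
    token i attends to token j.
    """
    dim = len(tokens[0])
    scale = math.sqrt(dim)
    outputs, weights = [], []
    for q in tokens:
        # Similarity of this token to every token, scaled by sqrt(dim).
        scores = [sum(a * b for a, b in zip(q, k)) / scale for k in tokens]
        w = softmax(scores)
        weights.append(w)
        # Each output is the attention-weighted average of all tokens.
        outputs.append(
            [sum(w[j] * tokens[j][d] for j in range(len(tokens))) for d in range(dim)]
        )
    return outputs, weights


# Hypothetical features for one location, one vector per representation.
features = {
    "raw": [0.2, 0.1, 0.9, 0.4],
    "voxel": [0.3, 0.0, 0.8, 0.5],
    "bev": [0.9, 0.7, 0.1, 0.2],
}

fused, attn = self_attention(list(features.values()))
for name, row in zip(features, attn):
    print(name, [round(x, 3) for x in row])
```

Each printed row shows how one representation distributes its attention over all three. A real model would learn the projections and typically stack several such layers with multiple heads.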

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| 3D Object Detection | KITTI Cars Easy | M3DeTR | AP | 90.28% | #6 |
| 3D Object Detection | KITTI Cars Easy val | M3DeTR | AP | 92.29 | #3 |
| 3D Object Detection | KITTI Cars Hard | M3DeTR | AP | 76.96% | #6 |
| 3D Object Detection | KITTI Cars Hard val | M3DeTR | AP | 82.85 | #1 |
| 3D Object Detection | KITTI Cars Moderate | M3DeTR | AP | 81.73% | #9 |
| 3D Object Detection | KITTI Cars Moderate val | M3DeTR | AP | 85.41 | #2 |
| 3D Object Detection | KITTI Cyclist Easy val | M3DeTR | AP | 89.13 | #1 |
| 3D Object Detection | KITTI Cyclist Hard val | M3DeTR | AP | 68.29 | #1 |
| 3D Object Detection | KITTI Cyclist Moderate val | M3DeTR | AP | 71.70 | #1 |
| 3D Object Detection | KITTI Cyclists Easy | M3DeTR | AP | 83.83% | #2 |
| 3D Object Detection | KITTI Cyclists Hard | M3DeTR | AP | 59.03% | #2 |
| 3D Object Detection | KITTI Cyclists Moderate | M3DeTR | AP | 66.74% | #2 |
| 3D Object Detection | KITTI Pedestrian Easy val | M3DeTR | AP | 67.64 | #3 |
| 3D Object Detection | KITTI Pedestrian Hard val | M3DeTR | AP | 56.49 | #2 |
| 3D Object Detection | KITTI Pedestrian Moderate val | M3DeTR | AP | 60.63 | #3 |
| 3D Object Detection | KITTI Pedestrians Easy | M3DeTR | AP | 47.05% | #8 |
| 3D Object Detection | KITTI Pedestrians Hard | M3DeTR | AP | 38.75% | #8 |
| 3D Object Detection | KITTI Pedestrians Moderate | M3DeTR | AP | 41.02% | #11 |
| 3D Object Detection | waymo cyclist | M3DeTR | APH/L2 | 67.28 | #7 |
| 3D Object Detection | waymo pedestrian | M3DeTR | APH/L2 | 68.20 | #7 |
| 3D Object Detection | waymo vehicle | M3DeTR | APH/L2 | 70.54 | #6 |
| | | | L1 mAP | 77.66 | #2 |
| | | | AP | 77.09 | #1 |
