PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images

In this paper, we propose PETRv2, a unified framework for 3D perception from multi-view images. Building on PETR, PETRv2 explores the effectiveness of temporal modeling, using temporal information from previous frames to boost 3D object detection. More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling. The 3D PE achieves temporal alignment of object positions across frames. A feature-guided position encoder is further introduced to improve the data adaptability of the 3D PE. To support multi-task learning (e.g., BEV segmentation and 3D lane detection), PETRv2 provides a simple yet effective solution by introducing task-specific queries, which are initialized under different spaces. PETRv2 achieves state-of-the-art performance on 3D object detection, BEV segmentation and 3D lane detection. A detailed robustness analysis is also conducted on the PETR framework. We hope PETRv2 can serve as a strong baseline for 3D perception. Code is available at https://github.com/megvii-research/PETR.
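The temporal alignment mentioned in the abstract can be illustrated with a minimal sketch. It assumes that alignment means re-expressing 3D points from the previous frame's ego coordinate system in the current frame's ego coordinate system, using ego-to-global poses given as 4x4 rigid transforms. The function names (`invert_rigid`, `align_previous_points`) and the pose convention are hypothetical, chosen for illustration; this is not the authors' implementation.

```python
from typing import List, Sequence, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
        for i in range(len(a))
    ]


def invert_rigid(t: Matrix) -> Matrix:
    """Invert a 4x4 rigid transform [R | t] as [R^T | -R^T t]."""
    r_t = [[t[j][i] for j in range(3)] for i in range(3)]  # transpose of R
    trans = [t[i][3] for i in range(3)]
    new_t = [-sum(r_t[i][k] * trans[k] for k in range(3)) for i in range(3)]
    return [r_t[i] + [new_t[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]


def align_previous_points(
    points_prev: Sequence[Sequence[float]],
    ego_to_global_prev: Matrix,
    ego_to_global_curr: Matrix,
) -> List[Tuple[float, float, float]]:
    """Map 3D points from the previous ego frame into the current ego frame.

    Assumed convention: each pose maps ego coordinates to global coordinates,
    so prev -> curr is inverse(curr pose) @ (prev pose).
    """
    prev_to_curr = matmul(invert_rigid(ego_to_global_curr), ego_to_global_prev)
    aligned = []
    for x, y, z in points_prev:
        out = matmul(prev_to_curr, [[x], [y], [z], [1.0]])
        aligned.append((out[0][0], out[1][0], out[2][0]))
    return aligned


# Example: the ego vehicle moved 2 m forward along x between the two frames.
identity = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0],
            [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
moved = [[1.0, 0.0, 0.0, 2.0], [0.0, 1.0, 0.0, 0.0],
         [0.0, 0.0, 1.0, 0.0], [0.0, 0.0, 0.0, 1.0]]
aligned_points = align_previous_points([(10.0, 1.0, 0.0)], identity, moved)
```

In this example, a point 10 m ahead of the vehicle in the previous frame ends up 8 m ahead in the current frame, so a static object keeps a consistent position once both frames share one coordinate system.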

ICCV 2023

Results from the Paper


Ranked #2 on Bird's-Eye View Semantic Segmentation on nuScenes (IoU lane - 224x480 - 100x100 at 0.5 metric)

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Bird's-Eye View Semantic Segmentation | nuScenes | PETRv2 | IoU lane - 224x480 - 100x100 at 0.5 | 44.8 | #2 |
| 3D Object Detection | nuScenes Camera Only | PETRv2-pure | NDS | 59.2 | #15 |
| | | | Future Frame | false | #1 |
| 3D Lane Detection | OpenLane | PETRv2-V∗ (VoVNetV2 with 400 anchor points) | F1 (all) | 61.2 | #2 |
| | | | Up & Down | - | #12 |
| | | | Curve | - | #12 |
| | | | Extreme Weather | - | #12 |
| | | | Night | - | #12 |
| | | | Intersection | - | #12 |
| | | | Merge & Split | - | #12 |
| | | | FPS (pytorch) | - | #2 |
| 3D Lane Detection | OpenLane | PETRv2-E (EfficientNet) | F1 (all) | 51.9 | #10 |
| | | | Up & Down | - | #12 |
| | | | Curve | - | #12 |
| | | | Extreme Weather | - | #12 |
| | | | Night | - | #12 |
| | | | Intersection | - | #12 |
| | | | Merge & Split | - | #12 |
| | | | FPS (pytorch) | - | #2 |
| 3D Lane Detection | OpenLane | PETRv2-V (VoVNetV2) | F1 (all) | 57.8 | #4 |
| | | | Up & Down | - | #12 |
| | | | Curve | - | #12 |
| | | | Extreme Weather | - | #12 |
| | | | Night | - | #12 |
| | | | Intersection | - | #12 |
| | | | Merge & Split | - | #12 |
| | | | FPS (pytorch) | - | #2 |
