Robust Camera Only 3D Object Detection
11 papers with code • 1 benchmark • 1 dataset
Most implemented papers
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries.
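To make "grid-shaped BEV queries" concrete, here is a minimal sketch of the geometry only: each query is tied to one cell of a regular grid laid over the ground plane around the ego vehicle. The function name `bev_query_grid`, the 4 x 4 grid, and the 50 m range are illustrative assumptions, not values from the paper, and the attention layers that actually consume the queries are omitted.

```python
def bev_query_grid(bev_h, bev_w, x_range, y_range):
    """Return the (x, y) ego-frame center of every cell in a bev_h x bev_w grid.

    Hypothetical helper: each returned point is the ground-plane location
    that one BEV query would be responsible for.
    """
    x_min, x_max = x_range
    y_min, y_max = y_range
    cell_w = (x_max - x_min) / bev_w  # cell extent along x, in meters
    cell_h = (y_max - y_min) / bev_h  # cell extent along y, in meters
    centers = []
    for row in range(bev_h):
        for col in range(bev_w):
            x = x_min + (col + 0.5) * cell_w
            y = y_min + (row + 0.5) * cell_h
            centers.append((x, y))
    return centers


# Assumed toy configuration: a 4 x 4 grid covering 100 m x 100 m.
queries = bev_query_grid(4, 4, (-50.0, 50.0), (-50.0, 50.0))
print(len(queries), queries[0], queries[-1])
```

In a real detector each grid cell would also carry a learnable feature vector; only the fixed cell centers are shown here.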
BEVDet: High-performance Multi-camera 3D Object Detection in Bird-Eye-View
As a fast version, BEVDet-Tiny scores 31.2% mAP and 39.2% NDS on the nuScenes val set.
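For readers unfamiliar with NDS, the nuScenes detection score combines mAP with five mean true-positive error metrics (translation, scale, orientation, velocity, attribute), each clipped to 1 and converted to a score. The sketch below implements that weighting. The function names are assumptions for illustration, and the per-metric errors of BEVDet-Tiny are not given here, so the example only derives what the two reported numbers imply about the sum of the five true-positive scores.

```python
def nds(mean_ap, tp_errors):
    """nuScenes detection score from mAP and the five mean TP errors."""
    if len(tp_errors) != 5:
        raise ValueError("expected five TP errors (mATE, mASE, mAOE, mAVE, mAAE)")
    # Each error is clipped to 1 and turned into a score in [0, 1].
    tp_scores = [1.0 - min(1.0, err) for err in tp_errors]
    return (5.0 * mean_ap + sum(tp_scores)) / 10.0


def tp_score_sum(nds_value, mean_ap):
    """Invert the NDS formula: the sum of the five TP scores implied by NDS and mAP."""
    return 10.0 * nds_value - 5.0 * mean_ap


# Using the reported 39.2% NDS and 31.2% mAP (as fractions).
implied = tp_score_sum(0.392, 0.312)
print(round(implied, 2))
```

Because mAP carries half of the total weight, a detector whose true-positive errors are all 1 or larger would score an NDS of exactly half its mAP.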
BEVDepth: Acquisition of Reliable Depth for Multi-view 3D Object Detection
In this research, we propose a new 3D object detector with a trustworthy depth estimation, dubbed BEVDepth, for camera-based Bird's-Eye-View (BEV) 3D object detection.
SRCN3D: Sparse R-CNN 3D for Compact Convolutional Multi-View 3D Object Detection and Tracking
Our novel sparse feature sampling module only utilizes local 2D region of interest (RoI) features calculated by the projection of 3D query boxes for further box refinement, leading to a fully-convolutional and deployment-friendly pipeline.
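The projection step mentioned above can be illustrated with a simplified sketch: the eight corners of a 3D box are projected through a pinhole camera, and their 2D bounding rectangle serves as the RoI. This is not the SRCN3D implementation. It assumes the box is already expressed in the camera frame (z pointing forward), is axis-aligned with no yaw, and that the intrinsics `fx`, `fy`, `cx`, `cy` are made-up values; all function names are hypothetical.

```python
from itertools import product


def box_corners(center, size):
    """Eight corners of an axis-aligned 3D box given its center and (dx, dy, dz) size."""
    x, y, z = center
    dx, dy, dz = size
    return [
        (x + sx * dx / 2, y + sy * dy / 2, z + sz * dz / 2)
        for sx, sy, sz in product((-1, 1), repeat=3)
    ]


def project_point(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point; None if it is not in front of the camera."""
    x, y, z = point
    if z <= 0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def box_to_roi(center, size, fx, fy, cx, cy):
    """2D RoI (u_min, v_min, u_max, v_max) enclosing the projected box, or None."""
    pixels = [project_point(c, fx, fy, cx, cy) for c in box_corners(center, size)]
    if any(p is None for p in pixels):
        return None  # simplification: skip boxes that cross the image plane
    us = [p[0] for p in pixels]
    vs = [p[1] for p in pixels]
    return (min(us), min(vs), max(us), max(vs))


# Assumed example: a 2 m x 2 m x 4 m box centered 10 m in front of the camera.
roi = box_to_roi((0.0, 0.0, 10.0), (2.0, 2.0, 4.0), fx=1000.0, fy=1000.0, cx=800.0, cy=450.0)
print(roi)  # (675.0, 325.0, 925.0, 575.0)
```

A full pipeline would additionally apply the camera extrinsics, handle box rotation, clip the RoI to the image bounds, and pool features from the resulting region.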
DETR3D: 3D Object Detection from Multi-view Images via 3D-to-2D Queries
This top-down approach outperforms its bottom-up counterpart in which object bounding box prediction follows per-pixel depth estimation, since it does not suffer from the compounding error introduced by a depth prediction model.
PETR: Position Embedding Transformation for Multi-View 3D Object Detection
Object queries can perceive the 3D position-aware features and perform end-to-end object detection.
BEVerse: Unified Perception and Prediction in Birds-Eye-View for Vision-Centric Autonomous Driving
Specifically, BEVerse first performs shared feature extraction and lifting to generate 4D BEV representations from multi-timestamp and multi-view images.
PolarFormer: Multi-camera 3D Object Detection with Polar Transformer
3D object detection in autonomous driving aims to reason about "what" the objects of interest are and "where" they are in a 3D world.
Time Will Tell: New Outlooks and A Baseline for Temporal Multi-View 3D Object Detection
While recent camera-only 3D detection methods leverage multiple timesteps, the limited history they use significantly hampers the extent to which temporal fusion can improve object perception.
Sparse4D: Multi-view 3D Object Detection with Sparse Spatial-Temporal Fusion
Bird's-eye-view (BEV) based methods have recently made great progress in the multi-view 3D detection task.