Bird's-Eye View Semantic Segmentation
15 papers with code • 2 benchmarks • 2 datasets
Most implemented papers
BEVFormer: Learning Bird's-Eye-View Representation from Multi-Camera Images via Spatiotemporal Transformers
In a nutshell, BEVFormer exploits both spatial and temporal information by interacting with spatial and temporal space through predefined grid-shaped BEV queries.
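As a rough illustration of what "predefined grid-shaped BEV queries" could look like, the sketch below assigns each query to one cell of a ground-plane grid centered on the ego vehicle. The function name `bev_query_centers`, the 4x4 grid, and the 0.5 m cell size are hypothetical choices for this example, not values taken from the paper.

```python
def bev_query_centers(rows, cols, cell_size_m):
    """Return the (x, y) center in meters of every BEV grid cell.

    Assumption for illustration: the ego vehicle sits at the grid center,
    and one query is attached to each cell.
    """
    centers = []
    for r in range(rows):
        for c in range(cols):
            # Shift by half the grid so the origin is the grid center,
            # and by 0.5 so we get the cell center rather than its corner.
            x = (c - cols / 2 + 0.5) * cell_size_m
            y = (r - rows / 2 + 0.5) * cell_size_m
            centers.append((x, y))
    return centers


# Hypothetical toy grid: 4 x 4 cells of 0.5 m each.
centers = bev_query_centers(rows=4, cols=4, cell_size_m=0.5)
```

Each center gives the real-world location that a query would be responsible for when it gathers spatial features from the cameras and temporal features from earlier frames.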
Cross-view Transformers for real-time Map-view Semantic Segmentation
The architecture consists of a convolutional image encoder for each view and cross-view transformer layers to infer a map-view semantic segmentation.
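The cross-view step can be pictured as map-view queries attending over image features pooled from every camera. The sketch below is generic scaled dot-product cross-attention in plain Python, under the assumption that the image encoder has already produced key and value vectors per token. It leaves out the learned projections and positional embeddings a real model would use, and all names are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_view_attention(map_queries, image_tokens):
    """Let each map-view query attend over tokens from all camera views.

    map_queries: list of query vectors, one per map cell.
    image_tokens: list of (key, value) vector pairs gathered from every view.
    """
    dim = len(map_queries[0])
    value_dim = len(image_tokens[0][1])
    outputs = []
    for q in map_queries:
        # Scaled dot product between the query and every image key.
        scores = [
            sum(qi * ki for qi, ki in zip(q, key)) / math.sqrt(dim)
            for key, _ in image_tokens
        ]
        weights = softmax(scores)
        # Weighted sum of the image values.
        out = [
            sum(w * value[j] for w, (_, value) in zip(weights, image_tokens))
            for j in range(value_dim)
        ]
        outputs.append(out)
    return outputs


# Toy example: one map query, two image tokens (e.g. one from each of two views).
queries = [[1.0, 0.0]]
tokens = [([0.0, 1.0], [2.0]), ([0.0, 1.0], [4.0])]
attended = cross_view_attention(queries, tokens)
```

In this toy case both keys score equally against the query, so the output is the plain average of the two values.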
CoBEVT: Cooperative Bird's Eye View Semantic Segmentation with Sparse Transformers
Extensive experiments on the V2V perception dataset OPV2V demonstrate that CoBEVT achieves state-of-the-art performance for cooperative BEV semantic segmentation.
MatrixVT: Efficient Multi-Camera to BEV Transformation for 3D Perception
This paper proposes an efficient multi-camera to Bird's-Eye-View (BEV) view transformation method for 3D perception, dubbed MatrixVT.
LandCover.ai: Dataset for Automatic Mapping of Buildings, Woodlands, Water and Roads from Aerial Imagery
km with resolution 50 cm per pixel and 176.76 sq.
Lift, Splat, Shoot: Encoding Images From Arbitrary Camera Rigs by Implicitly Unprojecting to 3D
By training on the entire camera rig, we provide evidence that our model is able to learn not only how to represent images but how to fuse predictions from all cameras into a single cohesive representation of the scene while being robust to calibration error.
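The "unprojecting to 3D" idea in the title can be sketched for a single pixel: the pixel's feature is spread along its camera ray according to a depth distribution ("lift"), and the resulting points are sum-pooled into ground-plane cells ("splat"). The code below is a minimal sketch assuming a pinhole camera, a scalar feature, and a BEV grid stored as a dictionary. The function name and all numeric values are hypothetical.

```python
import math


def lift_splat_pixel(u, feature, depth_bins, depth_probs, fx, cx, cell_size_m, bev):
    """Lift one pixel along its ray and sum-pool it into a BEV grid.

    u: pixel column; fx, cx: pinhole focal length and principal point (pixels).
    depth_bins / depth_probs: candidate depths (meters) and their probabilities.
    bev: dict mapping (ix, iz) cell indices to accumulated feature mass.
    """
    for depth, prob in zip(depth_bins, depth_probs):
        x = (u - cx) * depth / fx  # lateral offset in meters (pinhole model)
        z = depth                  # forward distance in meters
        cell = (math.floor(x / cell_size_m), math.floor(z / cell_size_m))
        # Splat: accumulate the probability-weighted feature in its cell.
        bev[cell] = bev.get(cell, 0.0) + prob * feature
    return bev


# Toy example: one pixel, two depth hypotheses, 1 m BEV cells.
bev = lift_splat_pixel(
    u=150, feature=2.0,
    depth_bins=[2.0, 4.0], depth_probs=[0.25, 0.75],
    fx=100.0, cx=100.0, cell_size_m=1.0, bev={},
)
```

Calling the same function for every pixel of every camera, after transforming points into a shared vehicle frame, would accumulate all views into one grid. That transform is omitted here for brevity.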
FIERY: Future Instance Prediction in Bird's-Eye View from Surround Monocular Cameras
We present FIERY: a probabilistic future prediction model in bird's-eye view from monocular cameras.
PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images
More specifically, we extend the 3D position embedding (3D PE) in PETR for temporal modeling.
Simple-BEV: What Really Matters for Multi-Sensor BEV Perception?
Building 3D perception systems for autonomous vehicles that do not rely on high-density LiDAR is a critical research problem because of the expense of LiDAR systems compared to cameras and other sensors.
LaRa: Latents and Rays for Multi-Camera Bird's-Eye-View Semantic Segmentation
Recent works in autonomous driving have widely adopted the bird's-eye-view (BEV) semantic map as an intermediate representation of the world.