3D Object Detection Models

Voxel R-CNN is a voxel-based two-stage framework for 3D object detection. It consists of a 3D backbone network, a 2D bird's-eye-view (BEV) Region Proposal Network (RPN), and a detection head. Voxel RoI Pooling is devised to extract RoI features directly from the 3D voxel feature volumes for further refinement.
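To make the idea of pooling RoI features straight from a 3D feature volume concrete, here is a minimal NumPy sketch. It is a simplified stand-in, not the paper's Voxel RoI Pooling operator: it assumes a dense feature volume, an axis-aligned RoI given in voxel index units, and plain max pooling over each RoI sub-cell. The function name roi_grid_pool and its parameters are hypothetical.

```python
import numpy as np

def roi_grid_pool(volume, roi, grid_size=2):
    """Max-pool a dense feature volume inside an axis-aligned RoI.

    volume: (C, X, Y, Z) feature volume (assumed dense for simplicity)
    roi:    (x0, y0, z0, x1, y1, z1) box in voxel index units
    Returns a (C, G, G, G) array, one pooled feature per RoI sub-cell.
    """
    c = volume.shape[0]
    lo = np.asarray(roi[:3], dtype=np.float64)
    hi = np.asarray(roi[3:], dtype=np.float64)
    out = np.zeros((c, grid_size, grid_size, grid_size), dtype=volume.dtype)

    # Split the RoI into grid_size equal intervals along each axis.
    edges = [np.linspace(lo[a], hi[a], grid_size + 1) for a in range(3)]

    for i in range(grid_size):
        for j in range(grid_size):
            for k in range(grid_size):
                slices = []
                for axis, g in enumerate((i, j, k)):
                    start = max(int(np.floor(edges[axis][g])), 0)
                    stop = int(np.ceil(edges[axis][g + 1]))
                    # Cover at least one voxel and stay inside the volume.
                    stop = min(max(stop, start + 1), volume.shape[axis + 1])
                    slices.append(slice(start, stop))
                cell = volume[(slice(None), *slices)]
                if cell.size:  # sub-cells outside the volume stay zero
                    out[:, i, j, k] = cell.reshape(c, -1).max(axis=1)
    return out
```

The pooled (C, G, G, G) tensor plays the role of the RoI features that a detection head would consume for box refinement.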

End to end, the point cloud is first divided into regular voxels and fed into the 3D backbone network for feature extraction. The 3D feature volumes are then converted into a BEV representation, on which the 2D backbone and RPN are applied to generate region proposals. Next, Voxel RoI Pooling extracts RoI features directly from the 3D feature volumes. Finally, the detection head uses the RoI features for further box refinement.
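The first two steps of this pipeline (voxelization and the 3D-to-BEV conversion) can be sketched in NumPy as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the names voxelize and to_bev are hypothetical, each voxel is summarized by the mean of its points, and the BEV conversion is assumed to fold the height axis into the channel axis, which is one common choice rather than something the description above specifies.

```python
import numpy as np

def voxelize(points, voxel_size, pc_range):
    """Divide a point cloud into regular voxels.

    points:     (N, 3) array of x, y, z coordinates
    voxel_size: (3,) voxel edge lengths
    pc_range:   (xmin, ymin, zmin, xmax, ymax, zmax)
    Returns per-voxel point counts (X, Y, Z) and mean coordinates (X, Y, Z, 3).
    """
    points = np.asarray(points, dtype=np.float64)
    voxel_size = np.asarray(voxel_size, dtype=np.float64)
    lo = np.asarray(pc_range[:3], dtype=np.float64)
    hi = np.asarray(pc_range[3:], dtype=np.float64)
    grid_shape = np.round((hi - lo) / voxel_size).astype(int)

    # Drop points that fall outside the detection range.
    inside = np.all((points >= lo) & (points < hi), axis=1)
    pts = points[inside]

    # Integer voxel index of every remaining point.
    idx = np.floor((pts - lo) / voxel_size).astype(int)
    idx = np.minimum(idx, grid_shape - 1)  # guard against float rounding

    counts = np.zeros(grid_shape, dtype=np.int64)
    sums = np.zeros((*grid_shape, 3), dtype=np.float64)
    np.add.at(counts, tuple(idx.T), 1)    # unbuffered add handles repeated voxels
    np.add.at(sums, tuple(idx.T), pts)

    occupied = counts[..., None] > 0
    means = np.divide(sums, counts[..., None],
                      out=np.zeros_like(sums), where=occupied)
    return counts, means

def to_bev(volume):
    """Fold the height axis into channels: (C, Z, Y, X) -> (C * Z, Y, X)."""
    c, z, y, x = volume.shape
    return volume.reshape(c * z, y, x)
```

In a full detector, the 3D backbone would sit between these two functions, and the 2D backbone and RPN would operate on the output of to_bev.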

Source: Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Tasks


Task                 Papers  Share
3D Object Detection  4       44.44%
Object Detection     4       44.44%
Feature Importance   1       11.11%