Voxel R-CNN

Introduced by Deng et al. in Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Voxel R-CNN is a voxel-based two-stage framework for 3D object detection. It consists of a 3D backbone network, a 2D bird's-eye-view (BEV) Region Proposal Network (RPN), and a detection head. Voxel RoI Pooling is devised to extract RoI features directly from the voxel features for further refinement.
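To make the idea of pooling RoI features straight from voxel features concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The function name `voxel_roi_pool`, the axis-aligned RoI, the regular grid of sample points, the Manhattan-distance neighbourhood, and the max-pool aggregation are all simplifying assumptions made for illustration.

```python
def voxel_roi_pool(voxel_features, roi_min, roi_max,
                   grid_size=2, voxel_size=1.0, max_dist=1):
    """Pool one feature vector per grid point of an axis-aligned RoI.

    voxel_features: dict mapping integer (ix, iy, iz) voxel indices to
                    feature lists; only non-empty voxels are stored.
    roi_min, roi_max: opposite corners of the RoI in metric coordinates.
    Returns grid_size**3 pooled vectors, ordered x-major, then y, then z.
    All names and defaults here are hypothetical.
    """
    if not voxel_features:
        return []
    feat_len = len(next(iter(voxel_features.values())))
    pooled = []
    for gx in range(grid_size):
        for gy in range(grid_size):
            for gz in range(grid_size):
                # Centre of this sub-cell of the RoI, in metric coordinates.
                centre = [
                    lo + (g + 0.5) * (hi - lo) / grid_size
                    for g, lo, hi in zip((gx, gy, gz), roi_min, roi_max)
                ]
                # Index of the voxel that contains the grid point.
                cx, cy, cz = (int(c // voxel_size) for c in centre)
                # Gather non-empty voxels close to the grid point
                # (assumed rule: Manhattan distance in voxel units).
                neighbours = [
                    feat
                    for (ix, iy, iz), feat in voxel_features.items()
                    if abs(ix - cx) + abs(iy - cy) + abs(iz - cz) <= max_dist
                ]
                if neighbours:
                    # Channel-wise max over the gathered voxel features.
                    pooled.append([max(col) for col in zip(*neighbours)])
                else:
                    pooled.append([0.0] * feat_len)
    return pooled
```

The linear scan over all voxels keeps the sketch short; a practical version would look neighbours up by index instead. The pooled vectors would then be passed to the detection head.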

In the full pipeline, the point cloud is first divided into regular voxels and fed into the 3D backbone network for feature extraction. The 3D feature volumes are then converted into a BEV representation, on which the 2D backbone and RPN are applied to generate region proposals. Next, Voxel RoI Pooling extracts RoI features directly from the 3D feature volumes. Finally, the detection head uses the RoI features for further box refinement.
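The first two data transformations in this pipeline, voxelization and the conversion to BEV, can be sketched without any learned components. The snippet below is an illustration under stated assumptions, not the published code. The helper names `voxelize`, `mean_features`, and `to_bev` are hypothetical, the per-voxel mean stands in for the 3D backbone, and stacking the height slices into the channel dimension is one common way to form a BEV map that is assumed here.

```python
from collections import defaultdict


def voxelize(points, voxel_size, origin=(0.0, 0.0, 0.0)):
    """Group (x, y, z) points into regular voxels keyed by (ix, iy, iz)."""
    voxels = defaultdict(list)
    for point in points:
        key = tuple(
            int((p - o) // s) for p, o, s in zip(point, origin, voxel_size)
        )
        voxels[key].append(tuple(point))
    return dict(voxels)


def mean_features(voxels):
    """Stand-in for the 3D backbone: one mean-coordinate vector per voxel."""
    return {
        key: [sum(axis) / len(pts) for axis in zip(*pts)]
        for key, pts in voxels.items()
    }


def to_bev(voxel_features, num_z):
    """Collapse the height axis by stacking height slices into channels.

    Each (ix, iy) cell receives a vector of length num_z * feature_len;
    slices with no voxel stay zero. Voxels outside [0, num_z) are skipped.
    """
    if not voxel_features:
        return {}
    feat_len = len(next(iter(voxel_features.values())))
    bev = {}
    for (ix, iy, iz), feat in voxel_features.items():
        if not 0 <= iz < num_z:
            continue
        cell = bev.setdefault((ix, iy), [0.0] * (num_z * feat_len))
        cell[iz * feat_len:(iz + 1) * feat_len] = feat
    return bev


if __name__ == "__main__":
    cloud = [(0.2, 0.3, 0.1), (0.4, 0.1, 0.2), (1.5, 0.2, 1.1)]
    grid = voxelize(cloud, voxel_size=(1.0, 1.0, 1.0))
    bev_map = to_bev(mean_features(grid), num_z=2)
    print(sorted(bev_map))
```

In the real model, the 2D backbone and RPN would run on a dense version of `bev_map`, while the per-voxel features would stay available for Voxel RoI Pooling.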

Source: Voxel R-CNN: Towards High Performance Voxel-based 3D Object Detection

Tasks

Task	Papers	Share
3D Object Detection	4	44.44%
Object Detection	4	44.44%
Feature Importance	1	11.11%

Components

Component	Type
RPN	Region Proposal
Voxel RoI Pooling	RoI Feature Extractors

Categories

3D Object Detection Models

Point Cloud Models