Frustum ConvNet: Sliding Frustums to Aggregate Local Point-Wise Features for Amodal 3D Object Detection

5 Mar 2019 · Zhixin Wang, Kui Jia

In this work, we propose a novel method termed Frustum ConvNet (F-ConvNet) for amodal 3D object detection from point clouds. Given 2D region proposals in an RGB image, our method first generates a sequence of frustums for each region proposal and uses the obtained frustums to group local points. F-ConvNet aggregates point-wise features into frustum-level feature vectors and arrays these feature vectors as a feature map for its subsequent fully convolutional network (FCN) component, which spatially fuses frustum-level features and supports end-to-end, continuous estimation of oriented boxes in 3D space. We also propose component variants of F-ConvNet, including an FCN variant that extracts multi-resolution frustum features, and a refined use of F-ConvNet over a reduced 3D space. Careful ablation studies verify the efficacy of these component variants. F-ConvNet assumes no prior knowledge of the working 3D environment and is thus dataset-agnostic. We present experiments on both the indoor SUN-RGBD and outdoor KITTI datasets. F-ConvNet outperforms all existing methods on SUN-RGBD, and at the time of submission it outperforms all published works on the KITTI benchmark. Code has been made available at: https://github.com/zhixinwang/frustum-convnet
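
The grouping and aggregation steps described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation. Everything below is an assumption made for illustration: the function names, the pinhole intrinsics `K`, slicing along the camera z axis rather than the true frustum axis, the slice height of twice the stride, and a single random linear layer with ReLU standing in for a learned point-wise feature network.

```python
import numpy as np


def points_in_box_frustum(points, box2d, K):
    """Mask of points whose image projection falls inside a 2D box.

    points: (N, 3) camera coordinates (x right, y down, z forward).
    box2d:  (xmin, ymin, xmax, ymax) in pixels.
    K:      (3, 3) pinhole intrinsics (assumed camera model).
    """
    z = points[:, 2]
    in_front = z > 1e-6
    safe_z = np.where(in_front, z, 1.0)  # avoid dividing by zero
    u = K[0, 0] * points[:, 0] / safe_z + K[0, 2]
    v = K[1, 1] * points[:, 1] / safe_z + K[1, 2]
    xmin, ymin, xmax, ymax = box2d
    return in_front & (u >= xmin) & (u <= xmax) & (v >= ymin) & (v <= ymax)


def frustum_feature_map(points, box2d, K, d_min, d_max, num_frustums, weights):
    """Slide overlapping depth slices through the box frustum and max-pool
    per-point features inside each slice.

    Returns (num_frustums, feat_dim); slices without points stay zero.
    """
    pts = points[points_in_box_frustum(points, box2d, K)]
    # Assumed layout: slice height is twice the stride, so neighbours overlap
    # and the last slice ends exactly at d_max.
    stride = (d_max - d_min) / (num_frustums + 1)
    height = 2.0 * stride
    fmap = np.zeros((num_frustums, weights.shape[1]))
    for i in range(num_frustums):
        near = d_min + i * stride
        far = near + height
        sel = pts[(pts[:, 2] >= near) & (pts[:, 2] < far)]
        if len(sel) == 0:
            continue
        # Express points relative to the slice's centre depth.
        local = sel - np.array([0.0, 0.0, 0.5 * (near + far)])
        # Stand-in for a learned point-wise network: linear layer + ReLU.
        point_feats = np.maximum(local @ weights, 0.0)
        # Order-invariant aggregation into one frustum-level vector.
        fmap[i] = point_feats.max(axis=0)
    return fmap


rng = np.random.default_rng(0)
K = np.array([[700.0, 0.0, 600.0], [0.0, 700.0, 180.0], [0.0, 0.0, 1.0]])
points = rng.uniform([-10.0, -2.0, 1.0], [10.0, 2.0, 40.0], size=(2000, 3))
weights = rng.normal(size=(3, 16))
fmap = frustum_feature_map(points, (500, 100, 700, 260), K, 0.0, 40.0, 8, weights)
print(fmap.shape)  # (8, 16)
```

The resulting array has one row per frustum, ordered by depth, which is the kind of feature map a 1D fully convolutional network could then process along the depth dimension.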

Datasets


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| 3D Object Detection | KITTI Cars Easy | F-ConvNet | AP | 85.88% | #17 |
| 3D Object Detection | KITTI Cars Hard | F-ConvNet | AP | 68.08% | #17 |
| 3D Object Detection | KITTI Cars Moderate | F-ConvNet | AP | 76.51% | #20 |
| 3D Object Detection | KITTI Cyclists Easy | F-ConvNet | AP | 79.58% | #3 |
| 3D Object Detection | KITTI Cyclists Hard | F-ConvNet | AP | 57.03% | #5 |
| 3D Object Detection | KITTI Cyclists Moderate | F-ConvNet | AP | 64.68% | #4 |
| 3D Object Detection | KITTI Pedestrians Easy | F-ConvNet | AP | 52.37% | #4 |
| 3D Object Detection | KITTI Pedestrians Hard | F-ConvNet | AP | 41.49% | #4 |
| 3D Object Detection | KITTI Pedestrians Moderate | F-ConvNet | AP | 43.38% | #6 |

Methods