Monocular 3D Object Detection
43 papers with code • 13 benchmarks • 4 datasets
Monocular 3D Object Detection is the task of drawing 3D bounding boxes around objects in a single 2D RGB image. It is a localization task performed without extra information such as depth maps, additional sensors, or multiple images.
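To make the output of the task concrete, here is a minimal sketch of how a 3D box can be described and drawn onto the image. It is not taken from any paper listed here. It assumes a box given by a center, dimensions, and a yaw angle in camera coordinates (x right, y down, z forward), and a pinhole camera with hypothetical intrinsics `fx`, `fy`, `u0`, `v0`. The function names `box3d_corners` and `project` are illustrative only.

```python
import math


def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box center, with z pointing away from the camera.
    dims:   (length, height, width) of the box.
    yaw:    rotation about the camera's vertical (y) axis, in radians.
    """
    cx, cy, cz = center
    length, height, width = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (length / 2, -length / 2):
        for dy in (height / 2, -height / 2):
            for dz in (width / 2, -width / 2):
                # Rotate the offset about the y axis, then shift to the center.
                x = c * dx + s * dz
                z = -s * dx + c * dz
                corners.append((cx + x, cy + dy, cz + z))
    return corners


def project(point, fx, fy, u0, v0):
    """Project a 3D camera-frame point to pixel coordinates (pinhole model)."""
    x, y, z = point
    if z <= 0:
        raise ValueError("point must be in front of the camera (z > 0)")
    return (fx * x / z + u0, fy * y / z + v0)


# Example: a car-sized box 10 m in front of the camera (all values assumed).
corners = box3d_corners(center=(0.0, 0.0, 10.0), dims=(4.0, 1.5, 2.0), yaw=0.0)
pixels = [project(p, fx=700.0, fy=700.0, u0=640.0, v0=360.0) for p in corners]
```

The sketch covers only the geometry of the output. A monocular detector has to estimate the center (including depth `z`), the dimensions, and the yaw from image evidence alone, which is why the lack of depth input makes the task hard.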
In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.
3D object detection from a single image without LiDAR is a challenging task due to the lack of accurate depth information.
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
To address this problem, we propose ImVoxelNet, a novel fully convolutional method of 3D object detection based on monocular or multi-view RGB images.
Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps.
We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed by a set of CAD models using a stochastic grammar model.
Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera pose, all in 3D.