Monocular 3D Object Detection
43 papers with code • 13 benchmarks • 4 datasets
Monocular 3D Object Detection is the task of drawing a 3D bounding box around each object in a single 2D RGB image. It is a localization task performed without any extra information such as depth maps, other sensors, or multiple images.
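A 3D bounding box is commonly parameterized by its center, dimensions, and yaw angle, and can be rendered into the image via the pinhole camera model. The sketch below illustrates this under assumed conventions (camera coordinates with z forward and y down, rotation about the vertical axis); the intrinsics matrix values are illustrative only.

```python
import numpy as np

def box3d_corners(center, dims, yaw):
    """Return the 8 corners (3 x 8) of a 3D box in camera coordinates.

    center: (x, y, z), with z pointing forward from the camera.
    dims:   (h, w, l) height, width, length in meters.
    yaw:    rotation about the camera's vertical (y) axis, in radians.
    """
    h, w, l = dims
    # Corners in the box's local frame, origin at the box center.
    x = np.array([ l,  l, -l, -l,  l,  l, -l, -l]) / 2.0
    y = np.array([ h,  h,  h,  h, -h, -h, -h, -h]) / 2.0
    z = np.array([ w, -w, -w,  w,  w, -w, -w,  w]) / 2.0
    corners = np.vstack([x, y, z])
    c, s = np.cos(yaw), np.sin(yaw)
    R = np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])  # rotation about y
    return R @ corners + np.asarray(center, dtype=float).reshape(3, 1)

def project_to_image(pts3d, K):
    """Project 3 x N camera-frame points to N x 2 pixel coordinates."""
    uvw = K @ pts3d
    return (uvw[:2] / uvw[2]).T

# Example: a car-sized box 10 m in front of the camera.
K = np.array([[721.5, 0.0, 609.6],
              [0.0, 721.5, 172.9],
              [0.0, 0.0, 1.0]])  # illustrative pinhole intrinsics
corners = box3d_corners(center=(0.0, 1.0, 10.0), dims=(1.5, 1.6, 3.9), yaw=0.1)
pixels = project_to_image(corners, K)
print(pixels.shape)  # (8, 2)
```

A monocular detector must recover these 3D parameters from image evidence alone, which is what makes the task ill-posed compared to LiDAR- or stereo-based detection.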
Libraries
Use these libraries to find Monocular 3D Object Detection models and implementations.

Most implemented papers
M3D-RPN: Monocular 3D Region Proposal Network for Object Detection
Understanding the world in 3D is a critical component of urban autonomous driving.
FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection
In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.
Learning Depth-Guided Convolutions for Monocular 3D Object Detection
3D object detection from a single image without LiDAR is a challenging task due to the lack of accurate depth information.
SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation
Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.
Kinematic 3D Object Detection in Monocular Video
In this work, we propose a novel method for monocular video-based 3D object detection which carefully leverages kinematic motion to improve precision of 3D localization.
Objects are Different: Flexible Monocular 3D Object Detection
The precise localization of 3D objects from a single image without depth information is a highly challenging problem.
ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection
To address this problem, we propose ImVoxelNet, a novel fully convolutional method of 3D object detection based on monocular or multi-view RGB images.
Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR
Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps.
Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image
We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed of a set of CAD models using a stochastic grammar model.
Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation
Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera pose, all in 3D.