Monocular 3D Object Detection

43 papers with code • 13 benchmarks • 4 datasets

Monocular 3D Object Detection is the task of drawing 3D bounding boxes around objects in a single 2D RGB image. It is a localization task performed without any extra information such as depth maps, additional sensors, or multiple images.
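To make the task definition concrete, the sketch below shows one common way a 3D bounding box can be parameterized (centre, dimensions, yaw) and how its corners map back onto the image through a pinhole camera. It is an illustration only, not the method of any paper listed here. The function names `box3d_corners` and `project`, the axis convention (x right, y down, z forward), and the intrinsics values are all assumptions chosen for the example.

```python
import math

def box3d_corners(center, dims, yaw):
    """Return the 8 corners of a 3D box in camera coordinates.

    center: (x, y, z) of the box centre in metres
            (assumed axes: x right, y down, z forward).
    dims:   (length, height, width) in metres.
    yaw:    rotation about the camera's vertical (y) axis, in radians.
    """
    cx, cy, cz = center
    length, height, width = dims
    c, s = math.cos(yaw), math.sin(yaw)
    corners = []
    for dx in (length / 2, -length / 2):
        for dy in (height / 2, -height / 2):
            for dz in (width / 2, -width / 2):
                # Rotate the corner offset about the y axis, then translate.
                x = cx + c * dx + s * dz
                z = cz - s * dx + c * dz
                corners.append((x, cy + dy, z))
    return corners

def project(point, fx, fy, u0, v0):
    """Project a camera-frame 3D point to pixel coordinates (pinhole model)."""
    x, y, z = point
    return (fx * x / z + u0, fy * y / z + v0)

# Hypothetical example: a car-sized box 10 m in front of the camera.
corners = box3d_corners((0.0, 0.0, 10.0), (4.0, 1.5, 2.0), 0.0)
pixels = [project(p, 700.0, 700.0, 640.0, 360.0) for p in corners]
```

A monocular detector has to solve the inverse of this projection: given only the pixels, it must estimate the centre, dimensions, and yaw. Depth is the hardest part, because dividing by z discards absolute scale.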


Most implemented papers

M3D-RPN: Monocular 3D Region Proposal Network for Object Detection

garrickbrazil/M3D-RPN ICCV 2019

Understanding the world in 3D is a critical component of urban autonomous driving.

FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection

open-mmlab/mmdetection3d 22 Apr 2021

In this paper, we study this problem with a practice built on a fully convolutional single-stage detector and propose a general framework FCOS3D.

Learning Depth-Guided Convolutions for Monocular 3D Object Detection

dingmyu/D4LCN CVPR 2020

3D object detection from a single image without LiDAR is a challenging task due to the lack of accurate depth information.

SMOKE: Single-Stage Monocular 3D Object Detection via Keypoint Estimation

lzccccc/SMOKE 24 Feb 2020

Estimating 3D orientation and translation of objects is essential for infrastructure-less autonomous navigation and driving.

Kinematic 3D Object Detection in Monocular Video

Nicholasli1995/EgoNet ECCV 2020

In this work, we propose a novel method for monocular video-based 3D object detection which carefully leverages kinematic motion to improve precision of 3D localization.

Objects are Different: Flexible Monocular 3D Object Detection

zhangyp15/MonoFlex CVPR 2021

The precise localization of 3D objects from a single image without depth information is a highly challenging problem.

ImVoxelNet: Image to Voxels Projection for Monocular and Multi-View General-Purpose 3D Object Detection

saic-vul/imvoxelnet 2 Jun 2021

To address this problem, we propose ImVoxelNet, a novel fully convolutional method of 3D object detection based on monocular or multi-view RGB images.

Advancing Self-supervised Monocular Depth Learning with Sparse LiDAR

AutoAILab/FusionDepth 20 Sep 2021

Unlike the existing methods that use sparse LiDAR mainly in a manner of time-consuming iterative post-processing, our model fuses monocular image features and sparse LiDAR features to predict initial depth maps.

Holistic 3D Scene Parsing and Reconstruction from a Single RGB Image

thusiyuan/holistic_scene_parsing ECCV 2018

We propose a computational framework to jointly parse a single RGB image and reconstruct a holistic 3D configuration composed by a set of CAD models using a stochastic grammar model.

Cooperative Holistic Scene Understanding: Unifying 3D Object, Layout, and Camera Pose Estimation

thusiyuan/cooperative_scene_parsing NeurIPS 2018

Holistic 3D indoor scene understanding refers to jointly recovering the i) object bounding boxes, ii) room layout, and iii) camera pose, all in 3D.