3D Object Detection

585 papers with code • 55 benchmarks • 48 datasets

3D Object Detection is a computer vision task in which the goal is to identify and localize objects in a 3D environment, typically by estimating each object's position, dimensions, and orientation. Many applications also require detection to run in real time. The task is crucial for applications such as autonomous vehicles, robotics, and augmented reality.
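To make the idea of locating an object by position, size, and orientation concrete, here is a minimal sketch of an oriented 3D bounding box. The `Box3D` class, its field names, and the axis convention (yaw measured counter-clockwise about a vertical z axis) are assumptions chosen for illustration; individual datasets and libraries define their own conventions.

```python
import math
from dataclasses import dataclass


@dataclass
class Box3D:
    """Hypothetical oriented box: center, size, and yaw about the z axis."""
    x: float       # box center
    y: float
    z: float
    length: float  # extent along the box's local x axis
    width: float   # extent along the box's local y axis
    height: float  # extent along the z axis
    yaw: float     # radians, counter-clockwise about +z (assumed convention)
    label: str = "car"
    score: float = 1.0

    def corners(self):
        """Return the 8 corner points as (x, y, z) tuples in world coordinates."""
        c, s = math.cos(self.yaw), math.sin(self.yaw)
        points = []
        for dx in (-0.5, 0.5):
            for dy in (-0.5, 0.5):
                for dz in (-0.5, 0.5):
                    # Corner offset in the box's local frame.
                    lx, ly, lz = dx * self.length, dy * self.width, dz * self.height
                    # Rotate about z by yaw, then translate to the box center.
                    points.append((self.x + c * lx - s * ly,
                                   self.y + s * lx + c * ly,
                                   self.z + lz))
        return points


# Usage: an axis-aligned box 4 m long, 2 m wide, 1.5 m tall.
box = Box3D(x=10.0, y=2.0, z=0.75, length=4.0, width=2.0, height=1.5, yaw=0.0)
print(len(box.corners()))  # 8
```

A detector's output can then be viewed as a list of such boxes, each with a class label and a confidence score.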

(Image credit: AVOD)

Benchmarks

These leaderboards are used to track progress in 3D Object Detection.

Dataset	Best Model
KITTI Cars Moderate	GLENet-VR
nuScenes	EA-LSS
SUN-RGBD val	Point-GCC+TR3D+FF
ScanNetV2	UDeerLvic
KITTI Cars Easy	GLENet-VR
KITTI Cars Hard	3D Dual-Fusion
nuScenes Camera Only	Far3D
KITTI Pedestrians Moderate	3D-FCT
KITTI Cyclists Easy	3D-FCT
KITTI Cyclists Moderate	3D-FCT
KITTI Cyclists Hard	SA-Det3D
KITTI Cars Easy val	SA-SSD+EBM
KITTI Cars Moderate val	SA-SSD+EBM
KITTI Cars Hard val	M3DeTR
KITTI Pedestrians Easy	IPOD
KITTI Pedestrians Hard	SVGA-Net
DAIR-V2X-I	MonoUNI
nuscenes Camera-Radar	HyDRa
waymo vehicle	PillarNeXt
Rope3D	MonoUNI
SUN-RGBD	CAGroup3D (Geo Only)
waymo cyclist	DSVT(val)
waymo pedestrian	DSVT(val)
S3DIS	Point-GCC+TR3D
V2XSet	V2X-ViT
nuScenes LiDAR only	DSVT
OPV2V	V2VNet (PointPillar backbone)
KITTI Pedestrian Easy val	PVCNN
KITTI Pedestrian Moderate val	PVCNN
KITTI Pedestrian Hard val	PVCNN
KITTI Cyclist Easy val	PVCNN
KITTI Cyclist Moderate val	PVCNN
KITTI Cyclist Hard val	F-PointNet++ [Qi:2018fd]
aiMotive Dataset	Lidar-Radar-Camera
3D Object Detection on Argoverse2 Camera Only	Far3D
waymo all_ns	CenterPoint
NYU Depth v2	SGPN-CNN
nuScenes-F	RRPN + R101 - F
nuScenes-FB	RRPN + R101 - FB
KITTI Pedestrian Hard	PiFeNet
KITTI Cyclists Moderate val	Deformable PV-RCNN
KITTI Pedestrians Moderate val	Deformable PV-RCNN
Dense Fog	PV-RCNN
KITTI Pedestrian Moderate	PiFeNet
Heavy Snowfall	PV-RCNN
Light Snowfall	PV-RCNN
Clear Weather	PV-RCNN
KITTI Pedestrian Easy	PiFeNet
KITTI Pedestrian	PiFeNet
V2X-SIM	Where2comm
DAIR-V2X	CoBEVFlow
Argoverse2	VoxelNeXt
Cityscapes 3D	TaskPrompter
Argoverse	ky_nctu_mo
IRV2V	CoBEVFlow

Libraries

Use these libraries to find 3D Object Detection models and implementations.

Library	Papers	Stars
open-mmlab/mmdetection3d	14	4,808
PaddlePaddle/Paddle3D	6	536
open-mmlab/OpenPCDet	5	4,320
DerrickXuNu/OpenCOOD	5	593

See all 11 libraries.

Datasets

Subtasks

Robust 3D Object Detection

Robust BEV Detection

Latest papers with no code

ContextualFusion: Context-Based Multi-Sensor Fusion for 3D Object Detection in Adverse Operating Conditions

no code yet • 23 Apr 2024

The fusion of multimodal sensor data streams such as camera images and lidar point clouds plays an important role in the operation of autonomous vehicles (AVs).

NeRF-DetS: Enhancing Multi-View 3D Object Detection with Sampling-adaptive Network of Continuous NeRF-based Representation

no code yet • 22 Apr 2024

As a preliminary work, NeRF-Det unifies the tasks of novel view synthesis and 3D perception, demonstrating that perceptual tasks can benefit from novel view synthesis methods like NeRF, significantly improving the performance of indoor multi-view 3D object detection.

Language-Driven Active Learning for Diverse Open-Set 3D Object Detection

no code yet • 19 Apr 2024

In this paper, we propose VisLED, a language-driven active learning framework for diverse open-set 3D Object Detection.

A Point-Based Approach to Efficient LiDAR Multi-Task Perception

no code yet • 19 Apr 2024

Unlike other LiDAR-based multi-task architectures, our proposed PAttFormer does not require separate feature encoders for multiple task-specific point cloud representations, resulting in a network that is 3x smaller and 1.4x faster while achieving competitive performance on the nuScenes and KITTI benchmarks for autonomous driving perception.

Leveraging 3D LiDAR Sensors to Enable Enhanced Urban Safety and Public Health: Pedestrian Monitoring and Abnormal Activity Detection

no code yet • 17 Apr 2024

The integration of Light Detection and Ranging (LiDAR) and Internet of Things (IoT) technologies offers transformative opportunities for public health informatics in urban safety and pedestrian well-being.

TempBEV: Improving Learned BEV Encoders with Combined Image and BEV Space Temporal Aggregation

no code yet • 17 Apr 2024

These results indicate the overall effectiveness of our approach and make a strong case for aggregating temporal information in both image and BEV latent spaces.

Multimodal 3D Object Detection on Unseen Domains

no code yet • 17 Apr 2024

To this end, we propose CLIX$^\text{3D}$, a multimodal fusion and supervised contrastive learning framework for 3D object detection that performs alignment of object features from same-class samples of different domains while pushing the features from different classes apart.

Equivariant Spatio-Temporal Self-Supervision for LiDAR Object Detection

no code yet • 17 Apr 2024

This can enable improved performance in downstream tasks that are equivariant to such transformations.

VFMM3D: Releasing the Potential of Image by Vision Foundation Model for Monocular 3D Object Detection

no code yet • 15 Apr 2024

Therefore, an effective solution involves transforming monocular images into LiDAR-like representations and employing a LiDAR-based 3D object detector to predict the 3D coordinates of objects.

Run-time Monitoring of 3D Object Detection in Automated Driving Systems Using Early Layer Neural Activation Patterns

no code yet • 11 Apr 2024

To address the real-time operation requirements in ADS, we also introduce a novel introspection method that combines activation patterns from multiple layers of the detector's backbone and report its performance.
