Object Detection

3733 papers with code • 91 benchmarks • 262 datasets

Object Detection is a computer vision task in which the goal is to detect and locate objects of interest in an image or video. The task involves identifying the position and boundaries of objects in an image and classifying them into categories. It forms a crucial part of visual recognition, alongside image classification and retrieval.

State-of-the-art methods can be categorized into two main types: one-stage methods and two-stage methods:

  • One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.

  • Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.

The most popular benchmark is the MS COCO dataset. Models are typically evaluated using the mean Average Precision (mAP) metric.
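As a rough illustration of how such a metric is built, the sketch below computes Intersection over Union (IoU) between two boxes, then an Average Precision (AP) value for a single class, and finally averages per-class APs into an mAP. It is a simplified sketch, not the official COCO evaluation: the function names, the `(x_min, y_min, x_max, y_max)` box format, the single fixed IoU threshold of 0.5, and the greedy matching rule are all assumptions made for illustration. The official COCO protocol additionally averages AP over IoU thresholds from 0.50 to 0.95.

```python
from typing import List, Tuple

# Assumed box format for this sketch: (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(
    detections: List[Tuple[float, Box]],  # (confidence score, box)
    ground_truth: List[Box],
    iou_threshold: float = 0.5,
) -> float:
    """AP for one class: area under the interpolated precision-recall curve."""
    if not ground_truth:
        return 0.0
    matched = [False] * len(ground_truth)
    hits = []  # 1 for a true positive, 0 for a false positive, in score order
    for _score, box in sorted(detections, key=lambda d: d[0], reverse=True):
        # Greedily match to the best still-unmatched ground-truth box.
        best_iou, best_idx = 0.0, -1
        for idx, gt in enumerate(ground_truth):
            if matched[idx]:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx >= 0 and best_iou >= iou_threshold:
            matched[best_idx] = True
            hits.append(1)
        else:
            hits.append(0)

    # Precision and recall after each detection, in descending score order.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += hit
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truth))

    # Make precision monotonically non-increasing (interpolation step).
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Sum rectangle areas under the interpolated curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class_ap: List[float]) -> float:
    """mAP is the mean of the per-class AP values."""
    return sum(per_class_ap) / len(per_class_ap) if per_class_ap else 0.0
```

In practice, an established evaluation toolkit is preferable to a hand-rolled version like this one, since benchmark protocols differ in matching rules, interpolation, and the handling of crowded or difficult regions.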

( Image credit: Detectron )

Latest papers with no code

Reviewing Intelligent Cinematography: AI research for camera-based video production

no code yet • 8 May 2024

The main discussion categorizes work by four production types: General Production, Virtual Production, Live Production and Aerial Production.

A New Dataset and Comparative Study for Aphid Cluster Detection and Segmentation in Sorghum Fields

no code yet • 7 May 2024

In this study, we trained and evaluated four real-time semantic segmentation models and three object detection models specifically for aphid cluster segmentation and detection.

DriveWorld: 4D Pre-trained Scene Understanding via World Models for Autonomous Driving

no code yet • 7 May 2024

In this paper, we address this challenge by introducing a world model-based autonomous driving 4D representation learning framework, dubbed DriveWorld, which is capable of pre-training from multi-camera driving videos in a spatio-temporal fashion.

ViewFormer: Exploring Spatiotemporal Modeling for Multi-View 3D Occupancy Perception via View-Guided Transformers

no code yet • 7 May 2024

Leveraging the proposed view attention as well as an additional multi-frame streaming temporal attention, we introduce ViewFormer, a vision-centric transformer-based framework for spatiotemporal feature aggregation.

Deep Event-based Object Detection in Autonomous Driving: A Survey

no code yet • 7 May 2024

Object detection plays a critical role in autonomous driving, where accurately and efficiently detecting objects in fast-moving scenes is crucial.

A Novel Wide-Area Multiobject Detection System with High-Probability Region Searching

no code yet • 7 May 2024

To address these challenges, this paper presents a hybrid system that incorporates a wide-angle camera, a high-speed search camera, and a galvano-mirror.

Low-light Object Detection

no code yet • 6 May 2024

In this competition we employed a model fusion approach to achieve object detection results close to those of real images.

Salient Object Detection From Arbitrary Modalities

no code yet • 6 May 2024

The most prominent characteristics of AM SOD are that the modality types and modality numbers will be arbitrary or dynamically changed.

Modality Prompts for Arbitrary Modality Salient Object Detection

no code yet • 6 May 2024

A novel modality-adaptive Transformer (MAT) will be proposed to investigate two fundamental challenges of AM SOD, i.e., more diverse modality discrepancies caused by varying modality types that need to be processed, and dynamic fusion design caused by an uncertain number of modalities present in the inputs of the multimodal fusion strategy.

BadFusion: 2D-Oriented Backdoor Attacks against 3D Object Detection

no code yet • 6 May 2024

To tackle this issue, we propose an innovative 2D-oriented backdoor attack against LiDAR-camera fusion methods for 3D object detection, named BadFusion, for preserving trigger effectiveness throughout the entire fusion process.