Object Detection

3789 papers with code • 92 benchmarks • 267 datasets

Object detection is a computer vision task whose goal is to detect and locate objects of interest in an image or video. It involves identifying the position and boundaries of each object, typically as a bounding box, and classifying each object into a category. It forms a crucial part of visual recognition, alongside image classification and retrieval.

State-of-the-art methods fall into two main types, one-stage methods and two-stage methods:

  • One-stage methods prioritize inference speed, and example models include YOLO, SSD and RetinaNet.

  • Two-stage methods prioritize detection accuracy, and example models include Faster R-CNN, Mask R-CNN and Cascade R-CNN.

The most popular benchmark is the MS COCO dataset. Models are typically evaluated with the mean Average Precision (mAP) metric.
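
To make the Mean Average Precision metric concrete, here is a minimal sketch in Python. It assumes boxes in (x1, y1, x2, y2) corner format, a single IoU threshold of 0.5, and greedy matching of predictions to ground truth. The names iou, average_precision and mean_average_precision are illustrative, not taken from any library. The official COCO protocol additionally averages over IoU thresholds from 0.50 to 0.95 and uses its own matching and interpolation rules, so this sketch will not reproduce leaderboard numbers.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # assumed format: (x1, y1, x2, y2)
Pred = Tuple[float, Box]                 # (confidence score, box)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def average_precision(preds: List[Pred], gts: List[Box], iou_thr: float = 0.5) -> float:
    """AP for one class: area under the interpolated precision-recall curve."""
    if not gts:
        return 0.0
    ranked = sorted(preds, key=lambda p: p[0], reverse=True)
    matched = [False] * len(gts)
    hits: List[bool] = []
    for _, box in ranked:
        # Greedily pick the unmatched ground-truth box with the highest IoU.
        best_j, best_iou = -1, iou_thr
        for j, gt in enumerate(gts):
            overlap = iou(box, gt)
            if not matched[j] and overlap >= best_iou:
                best_j, best_iou = j, overlap
        if best_j >= 0:
            matched[best_j] = True
        hits.append(best_j >= 0)

    # Precision and recall after each ranked prediction.
    precisions, recalls, tp = [], [], 0
    for rank, hit in enumerate(hits, start=1):
        tp += int(hit)
        precisions.append(tp / rank)
        recalls.append(tp / len(gts))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(per_class: Dict[str, Tuple[List[Pred], List[Box]]]) -> float:
    """Mean of per-class AP values; each class maps to (predictions, ground truth)."""
    aps = [average_precision(p, g) for p, g in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


if __name__ == "__main__":
    gts = [(0, 0, 10, 10), (20, 20, 30, 30)]
    preds = [(0.9, (0, 0, 10, 10)), (0.8, (50, 50, 60, 60)), (0.7, (20, 20, 30, 30))]
    print(average_precision(preds, gts))
```

In the usage example, the second-ranked prediction is a false positive, so precision dips before the last true positive is found, and the AP falls below 1.0 even though both objects are eventually detected.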

(Image credit: Detectron)

Bangladeshi Native Vehicle Detection in Wild

bipin-saha/bnvd 20 May 2024

To advance terrestrial object detection research, this paper proposes a native vehicle detection dataset for the most commonly appearing vehicle classes in Bangladesh.

Stars: 0

SLAB: Efficient Transformers with Simplified Linear Attention and Progressive Re-parameterized Batch Normalization

xinghaochen/slab 19 May 2024

However, replacing LayerNorm with more efficient BatchNorm in transformer often leads to inferior performance and collapse in training.

Stars: 19

FADet: A Multi-sensor 3D Object Detection Network based on Local Featured Attention

ziongo6/fadet 19 May 2024

Camera, LiDAR and radar are common perception sensors for autonomous driving tasks.

Stars: 6

Grounding DINO 1.5: Advance the "Edge" of Open-Set Object Detection

idea-research/grounding-dino-1.5-api 16 May 2024

Empirical results demonstrate the effectiveness of Grounding DINO 1.5, with the Grounding DINO 1.5 Pro model attaining a 54.3 AP on the COCO detection benchmark and a 55.7 AP on the LVIS-minival zero-shot transfer benchmark, setting new records for open-set object detection.

Stars: 463

DiverGen: Improving Instance Segmentation by Learning Wider Data Distribution with More Diverse Generative Data

aim-uofa/DiverGen 16 May 2024

Instance segmentation is data-hungry, and as model capacity increases, data scale becomes crucial for improving the accuracy.

Stars: 26

Grounded 3D-LLM with Referent Tokens

OpenRobotLab/Grounded_3D-LLM 16 May 2024

Prior studies on 3D scene understanding have primarily developed specialized models for specific tasks or required task-specific fine-tuning.

Stars: 24

SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

naver/shine 16 May 2024

Open-vocabulary object detection (OvOD) has transformed detection into a language-guided task, empowering users to freely define their class vocabularies of interest during inference.

Stars: 18

Size-invariance Matters: Rethinking Metrics and Losses for Imbalanced Multi-object Salient Object Detection

ferry-li/si-sod 16 May 2024

This paper explores the size-invariance of evaluation metrics in Salient Object Detection (SOD), especially when multiple targets of diverse sizes co-exist in the same image.

Stars: 7

SpecDETR: A Transformer-based Hyperspectral Point Object Detection Network

zhaoxuli123/specdetr 16 May 2024

We develop a simulated hyperSpectral Point Object Detection benchmark termed SPOD, and for the first time, evaluate and compare the performance of current object detection networks and HTD methods on hyperspectral multi-class point object detection.

Stars: 4

Towards Task-Compatible Compressible Representations

adeandrade/research 16 May 2024

We evaluate the impact of this idea in the context of input reconstruction more rigorously and extend it to other computer vision tasks.

Stars: 3