2D Object Detection
84 papers with code • 14 benchmarks • 59 datasets
Most implemented papers
RandomRooms: Unsupervised Pre-training from Synthetic Shapes and Randomized Layouts for 3D Object Detection
In particular, we propose to generate random layouts of a scene by making use of the objects in the synthetic CAD dataset and learn the 3D scene representation by applying object-level contrastive learning on two random scenes generated from the same set of synthetic objects.
Frustum-PointPillars: A Multi-Stage Approach for 3D Object Detection using RGB Camera and LiDAR
We train our network on the KITTI dataset and perform experiments to show the effectiveness of our network.
Grounded Language-Image Pre-training
The unification brings two benefits: 1) it allows GLIP to learn from both detection and grounding data to improve both tasks and bootstrap a good grounding model; 2) GLIP can leverage massive image-text pairs by generating grounding boxes in a self-training fashion, making the learned representation semantic-rich.
Detecting Overlapping Objects in X-ray Security Imagery by a Label-aware Mechanism
One of the key challenges to the X-ray security check is to detect the overlapped items in backpacks or suitcases in the X-ray images.
Multitask AET with Orthogonal Tangent Regularity for Dark Object Detection
To enhance object detection in a dark environment, we propose a novel multitask auto encoding transformation (MAET) model which is able to explore the intrinsic pattern behind illumination translation.
InternImage: Exploring Large-Scale Vision Foundation Models with Deformable Convolutions
Compared to the great progress of large-scale vision transformers (ViTs) in recent years, large-scale models based on convolutional neural networks (CNNs) are still in an early state.
Object Detection with Transformers: A Review
The astounding performance of transformers in natural language processing (NLP) has motivated researchers to explore their applications in computer vision tasks.
OpenAgents: An Open Platform for Language Agents in the Wild
Language agents show potential in being capable of utilizing natural language for varied and intricate tasks in diverse environments, particularly when built upon large language models (LLMs).
SCAResNet: A ResNet Variant Optimized for Tiny Object Detection in Transmission and Distribution Towers
This demonstrates the advantages of our SCAResNet in detecting transmission and distribution towers and its value in tiny object detection.
scikit-image: Image processing in Python
scikit-image is an image processing library that implements algorithms and utilities for use in research, education and industry applications.