Pedestrian Detection
125 papers with code • 9 benchmarks • 16 datasets
Pedestrian detection is the task of detecting and localizing pedestrians in images or video captured by a camera.
Further state-of-the-art results (e.g. on the KITTI dataset) can be found at 3D Object Detection.
(Image credit: High-level Semantic Feature Detection: A New Perspective for Pedestrian Detection)
Most implemented papers
YOLOv3: An Incremental Improvement
At 320 × 320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster.
Focal Loss for Dense Object Detection
Our novel Focal Loss focuses training on a sparse set of hard examples and prevents the vast number of easy negatives from overwhelming the detector during training.
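The down-weighting of easy examples described above can be illustrated with a minimal sketch of the binary focal loss, FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function names `focal_loss` and `cross_entropy` are hypothetical, and the defaults `gamma=2.0` and `alpha=0.25` are commonly used values assumed here for illustration, not a definitive implementation.

```python
import math


def cross_entropy(p, y):
    """Standard binary cross-entropy for one prediction.

    p: predicted probability of the positive class, strictly in (0, 1).
    y: ground-truth label, 1 (object) or 0 (background).
    """
    p_t = p if y == 1 else 1.0 - p  # probability assigned to the true class
    return -math.log(p_t)


def focal_loss(p, y, gamma=2.0, alpha=0.25):
    """Binary focal loss for one prediction (illustrative sketch).

    The factor (1 - p_t) ** gamma shrinks the loss of well-classified
    examples; alpha balances positives against negatives.
    """
    p_t = p if y == 1 else 1.0 - p
    alpha_t = alpha if y == 1 else 1.0 - alpha
    return -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)


if __name__ == "__main__":
    easy_negative = focal_loss(0.1, 0)  # background scored 0.1: easy
    hard_negative = focal_loss(0.9, 0)  # background scored 0.9: hard
    print(easy_negative, hard_negative)
```

With these assumed settings, an easy negative scored at p = 0.1 contributes about 0.0008 to the loss while a hard negative scored at p = 0.9 contributes about 1.4, so the many easy negatives no longer dominate the total.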
FCOS: Fully Convolutional One-Stage Object Detection
By eliminating the predefined set of anchor boxes, FCOS completely avoids the complicated computation related to anchor boxes such as calculating overlapping during training.
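To make the anchor-free idea concrete, here is a minimal sketch of per-location regression targets in the style of FCOS: each location inside a ground-truth box predicts its distances to the four box sides, so no anchor-to-box overlap has to be computed. The function name `fcos_targets` and the `(x0, y0, x1, y1)` box layout are assumptions for illustration, and multi-level assignment across the feature pyramid is left out.

```python
import math


def fcos_targets(x, y, box):
    """Regression targets for one location and one ground-truth box.

    x, y: location coordinates in image space.
    box:  (x0, y0, x1, y1) with x0 < x1 and y0 < y1 (assumed layout).

    Returns ((l, t, r, b), centerness) when the location lies strictly
    inside the box, or None when it is treated as a negative sample.
    """
    x0, y0, x1, y1 = box
    l, t = x - x0, y - y0  # distances to the left and top sides
    r, b = x1 - x, y1 - y  # distances to the right and bottom sides
    if min(l, t, r, b) <= 0:
        return None
    # Centerness is 1.0 at the box center and decays toward the borders.
    centerness = math.sqrt((min(l, r) / max(l, r)) * (min(t, b) / max(t, b)))
    return (l, t, r, b), centerness


if __name__ == "__main__":
    print(fcos_targets(50, 50, (0, 0, 100, 100)))   # box center
    print(fcos_targets(25, 50, (0, 0, 100, 100)))   # off-center location
    print(fcos_targets(150, 50, (0, 0, 100, 100)))  # outside the box
```

The centerness value is the quantity FCOS uses to suppress low-quality boxes predicted far from an object's center; a location outside every box is simply a negative sample.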
Feature Pyramid Networks for Object Detection
Feature pyramids are a basic component in recognition systems for detecting objects at different scales.
YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
YOLOv7 surpasses all known object detectors in both speed and accuracy in the range from 5 FPS to 160 FPS and has the highest accuracy 56.8% AP among all known real-time object detectors with 30 FPS or higher on GPU V100.
YOLOv6: A Single-Stage Object Detection Framework for Industrial Applications
The YOLO community has prospered overwhelmingly to enrich its use in a multitude of hardware platforms and abundant scenarios.
Fast Algorithms for Convolutional Neural Networks
The algorithms compute minimal complexity convolution over small tiles, which makes them fast with small filters and small batch sizes.
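The smallest instance of this idea is the one-dimensional Winograd minimal filtering algorithm F(2, 3), which produces two outputs of a 3-tap filter over a 4-element tile using 4 multiplications instead of 6. The sketch below is a plain-Python illustration with hypothetical function names; practical implementations nest the 1D form into 2D tiles and precompute the filter transform once per filter.

```python
def direct_f23(d, g):
    """Reference: two outputs of a 3-tap filter, using 6 multiplications.

    As is usual in CNNs, this is a correlation (the filter is not flipped).
    d: input tile of 4 values.
    g: filter of 3 values.
    """
    return [
        d[0] * g[0] + d[1] * g[1] + d[2] * g[2],
        d[1] * g[0] + d[2] * g[1] + d[3] * g[2],
    ]


def winograd_f23(d, g):
    """Winograd F(2, 3): the same two outputs with 4 multiplications."""
    # Filter transform: depends only on g, so it can be computed once
    # per filter and reused across every input tile.
    u0 = g[0]
    u1 = (g[0] + g[1] + g[2]) / 2
    u2 = (g[0] - g[1] + g[2]) / 2
    u3 = g[2]
    # One multiplication per transformed element.
    m1 = (d[0] - d[2]) * u0
    m2 = (d[1] + d[2]) * u1
    m3 = (d[2] - d[1]) * u2
    m4 = (d[1] - d[3]) * u3
    # Output transform: additions and subtractions only.
    return [m1 + m2 + m3, m2 - m3 - m4]


if __name__ == "__main__":
    tile = [1, 2, 3, 4]
    taps = [1, 0, -1]
    print(direct_f23(tile, taps))
    print(winograd_f23(tile, taps))
```

Both functions return the same values up to floating-point rounding. The saving comes from trading multiplications for cheap additions, which is why the approach suits small filters and small tiles.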
Beyond Appearance: a Semantic Controllable Self-Supervised Learning Framework for Human-Centric Visual Tasks
Unlike the existing self-supervised learning methods, prior knowledge from human images is utilized in SOLIDER to build pseudo semantic labels and import more semantic information into the learned representation.
Detection in Crowded Scenes: One Proposal, Multiple Predictions
We propose a simple yet effective proposal-based object detector, aiming at detecting highly-overlapped instances in crowded scenes.
Multiview Detection with Feature Perspective Transformation
First, how should we aggregate cues from the multiple views?