Video object detection is the task of detecting objects in video rather than in still images.
(Image credit: Learning Motion Priors for Efficient Video Object Detection)
In this paper we propose a method that leverages temporal context from the unlabeled frames of a novel camera to improve performance at that camera.
Models and examples built with TensorFlow
Ranked #11 on Video Object Detection on ImageNet VID
OBJECT RECOGNITION, REAL-TIME OBJECT DETECTION, VIDEO OBJECT DETECTION
This paper introduces an online model for object detection in videos designed to run in real-time on low-powered mobile and embedded devices.
The explosive growth in video streaming gives rise to challenges in performing video understanding at high accuracy and low computation cost.
Ranked #4 on Action Recognition on Something-Something V2 (using extra training data)
ACTION CLASSIFICATION, ACTION RECOGNITION, VIDEO OBJECT DETECTION, VIDEO RECOGNITION, VIDEO UNDERSTANDING
In this work, we argue that aggregating features at the full-sequence level will lead to more discriminative and robust features for video object detection.
Ranked #3 on Video Object Detection on ImageNet VID
The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc.
Ranked #7 on Video Object Detection on ImageNet VID
High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e.g., those that require detecting objects from video streams in real time.
We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.
Ranked #2 on Video Object Detection on ImageNet VID
In this paper, we introduce a new design to capture the interactions across the objects in spatio-temporal context.
We provide a large-scale drone-captured dataset, VisDrone, which includes four tracks, i.e., (1) image object detection, (2) video object detection, (3) single object tracking, and (4) multi-object tracking.