34 papers with code • 2 benchmarks • 4 datasets
Video object detection is the task of detecting objects from a video as opposed to images.
( Image credit: Learning Motion Priors for Efficient Video Object Detection )
In this paper we propose a method that leverages temporal context from the unlabeled frames of a novel camera to improve performance at that camera.
This paper introduces an online model for object detection in videos designed to run in real-time on low-powered mobile and embedded devices.
In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).
Ranked #1 on Video Object Detection on DAVIS 2017
In this work, we argue that aggregating features in the full-sequence level will lead to more discriminative and robust features for video object detection.
Ranked #3 on Video Object Detection on ImageNet VID
The accuracy of detection suffers from degenerated object appearances in videos, e. g., motion blur, video defocus, rare poses, etc.
Ranked #8 on Video Object Detection on ImageNet VID
The explosive growth in video streaming gives rise to challenges on performing video understanding at high accuracy and low computation cost.
Ranked #10 on Video Object Detection on ImageNet VID
We argue that there are two important cues for humans to recognize objects in videos: the global semantic information and the local localization information.
Ranked #2 on Video Object Detection on ImageNet VID
In this paper, we introduce a new design to capture the interactions across the objects in spatio-temporal context.
High-performance object detection relies on expensive convolutional networks to compute features, often leading to significant challenges in applications, e. g. those that require detecting objects from video streams in real time.