Video Object Detection

71 papers with code • 8 benchmarks • 12 datasets

Video object detection is the task of detecting objects in video rather than in still images.

(Image credit: Learning Motion Priors for Efficient Video Object Detection)

Most implemented papers

Emerging Properties in Self-Supervised Vision Transformers

facebookresearch/dino ICCV 2021

In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).

TSM: Temporal Shift Module for Efficient Video Understanding

MIT-HAN-LAB/temporal-shift-module ICCV 2019

The explosive growth in video streaming gives rise to challenges in performing video understanding at high accuracy and low computation cost.

Mobile Video Object Detection with Temporally-Aware Feature Maps

tensorflow/models CVPR 2018

This paper introduces an online model for object detection in videos designed to run in real-time on low-powered mobile and embedded devices.

Towards High Performance Video Object Detection for Mobiles

stanlee321/LightFlow-TensorFlow 16 Apr 2018

In this paper, we present a lightweight network architecture for video object detection on mobile devices.

Transferable Adversarial Attacks for Image and Video Object Detection

LiangSiyuan21/Adversarial-Attacks-for-Image-and-Video-Object-Detection 30 Nov 2018

Adversarial examples have been demonstrated to threaten many computer vision tasks including object detection.

Context R-CNN: Long Term Temporal Context for Per-Camera Object Detection

tensorflow/models CVPR 2020

In this paper we propose a method that leverages temporal context from the unlabeled frames of a novel camera to improve performance at that camera.

HoughNet: Integrating near and long-range evidence for visual detection

giddyyupp/coco-minitrain 14 Apr 2021

This paper presents HoughNet, a one-stage, anchor-free, voting-based, bottom-up object detection method.

TransVOD: End-to-End Video Object Detection with Spatial-Temporal Transformers

qianyuzqy/TransVOD_Lite 13 Jan 2022

Detection Transformer (DETR) and Deformable DETR have been proposed to eliminate the need for many hand-designed components in object detection while performing as well as previous complex hand-crafted detectors.

Flow-Guided Feature Aggregation for Video Object Detection

msracver/Flow-Guided-Feature-Aggregation ICCV 2017

The accuracy of detection suffers from degenerated object appearances in videos, e.g., motion blur, video defocus, rare poses, etc.