Video Object Tracking
28 papers with code • 3 benchmarks • 11 datasets
Video Object Detection aims to detect targets in videos using both spatial and temporal information. It's usually deeply integrated with tasks such as Object Detection and Object Tracking.
Libraries
Use these libraries to find Video Object Tracking models and implementations
Most implemented papers
Learning Object Permanence from Video
The fourth subtask, where a target object is carried by a containing object, is particularly challenging because it requires a system to reason about a moving location of an invisible object.
Fast Template Matching and Update for Video Object Tracking and Segmentation
Specifically, the reinforcement learning agent learns to decide whether to update the target template according to the quality of the predicted result.
ApproxDet: Content and Contention-Aware Approximate Object Detection for Mobiles
In this paper we introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements in the face of changing content and resource contention scenarios.
Contrastive Transformation for Self-supervised Correspondence Learning
It is worth mentioning that our method also surpasses the fully-supervised affinity representation (e.g., ResNet) and performs competitively against the recent fully-supervised algorithms designed for the specific tasks (e.g., VOT and VOS).
Attention over learned object embeddings enables complex visual reasoning
Neural networks have achieved success in a wide array of perceptual tasks but often fail at tasks involving both perception and higher-level reasoning.
Hopper: Multi-hop Transformer for Spatiotemporal Reasoning
We evaluate over CATER dataset and find that Hopper achieves 73.2% Top-1 accuracy using just 1 FPS by hopping through just a few critical frames.
TDIOT: Target-driven Inference for Deep Video Object Tracking
For effective video object tracking, object detection is integrated with a data association step performed by either a custom design inference architecture or an end-to-end joint training for tracking purpose.
Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking
In video object tracking, there exist rich temporal contexts among successive frames, which have been largely overlooked in existing trackers.
Do Different Tracking Tasks Require Different Appearance Models?
We show how most tracking tasks can be solved within this framework, and that the same appearance model can be successfully used to obtain results that are competitive against specialised methods for most of the tasks considered.
BundleTrack: 6D Pose Tracking for Novel Objects without Instance or Category-Level 3D Models
Most prior efforts, however, often assume that the target object's CAD model, at least at a category-level, is available for offline training or during online template matching.
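Several entries above, the TDIOT one most directly, describe tracking as object detection followed by a data association step. As a rough illustration of what such a step can look like, the sketch below greedily matches existing tracks to new detections by intersection-over-union (IoU). It is a generic baseline and not the method of any paper listed here. The names `iou` and `associate`, the `(x1, y1, x2, y2)` box format, and the `min_iou` threshold of 0.3 are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

# Hypothetical box format for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def associate(
    tracks: Dict[int, Box],
    detections: List[Box],
    min_iou: float = 0.3,  # assumed threshold, chosen only for illustration
) -> Dict[int, int]:
    """Greedily map track ids to detection indices, highest IoU first."""
    # Score every (track, detection) pair and visit them best-first.
    pairs = sorted(
        (
            (iou(box, det), track_id, det_idx)
            for track_id, box in tracks.items()
            for det_idx, det in enumerate(detections)
        ),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used_detections = set()
    for score, track_id, det_idx in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap too little to match
        if track_id in matches or det_idx in used_detections:
            continue  # each track and each detection is matched at most once
        matches[track_id] = det_idx
        used_detections.add(det_idx)
    return matches


if __name__ == "__main__":
    previous = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
    current = [(21, 21, 31, 31), (1, 1, 11, 11), (100, 100, 110, 110)]
    print(associate(previous, current))
```

Detections left unmatched (the third box in the usage example) would typically start new tracks, and tracks left unmatched would be kept alive for a few frames or dropped. Greedy matching is the simplest choice; optimal assignment (for example the Hungarian algorithm) or learned association replaces it in more elaborate trackers.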