We propose an Efficient Activity Detection System, Argus, for Extended Video Analysis in the surveillance scenario.
The key innovations of the proposed method include adaptive spatial feature selection and temporal consistency constraints, with which the new tracker enables joint spatial-temporal filter learning in a lower-dimensional discriminative manifold.
It is worth mentioning that our method also surpasses the fully-supervised affinity representation (e.g., ResNet) and performs competitively against the recent fully-supervised algorithms designed for the specific tasks (e.g., VOT and VOS).
We identify that online object tracking poses two new challenges: 1) it is difficult to generate imperceptible perturbations that can transfer across frames, and 2) real-time trackers require the attack to satisfy a certain level of efficiency.
Results: We build a baseline tracker on top of the CNN model and demonstrate that our approach based on the ConvLSTM outperforms the baseline in tool presence detection, spatial localization, and motion tracking by over 5.0%, 13.9%, and 12.6%, respectively.
Ranked #1 on Surgical tool detection on Cholec80
Specifically, the reinforcement learning agent learns to decide whether to update the target template according to the quality of the predicted result.
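The template-update decision described above can be illustrated with a minimal sketch. It is not the authors' method: it assumes a scalar quality score in [0, 1] for each predicted result, discretizes that score into a handful of states, and uses plain tabular Q-learning as a stand-in for whatever agent the paper trains. All names here (`quality_bin`, `choose_action`, `q_update`, `N_BINS`) and the reward signal are hypothetical.

```python
# Illustrative only: tabular Q-learning over a discretized quality score.
# The state space, reward, and hyperparameters are assumptions, not taken
# from the paper.

N_BINS = 5            # hypothetical number of quality-score bins
KEEP, UPDATE = 0, 1   # the two actions: keep or replace the target template


def quality_bin(quality: float) -> int:
    """Map a quality score in [0, 1] to a discrete state index."""
    return min(int(quality * N_BINS), N_BINS - 1)


def choose_action(q_table, quality: float) -> int:
    """Greedy policy: update the template only if UPDATE has the higher value."""
    values = q_table[quality_bin(quality)]
    return UPDATE if values[UPDATE] > values[KEEP] else KEEP


def q_update(q_table, quality, action, reward, next_quality, lr=0.1, gamma=0.9):
    """Apply one Q-learning step to the (state, action) pair just visited."""
    s, s_next = quality_bin(quality), quality_bin(next_quality)
    target = reward + gamma * max(q_table[s_next])
    q_table[s][action] += lr * (target - q_table[s][action])


# One row per quality bin, one column per action, initialized to zero.
q_table = [[0.0, 0.0] for _ in range(N_BINS)]
```

In a tracking loop, `choose_action` would be called once per frame with the quality of the current prediction, and `q_update` would be called once a reward (for example, the change in overlap with ground truth during training) becomes available.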
In this paper we introduce ApproxDet, an adaptive video object detection framework for mobile devices to meet accuracy-latency requirements in the face of changing content and resource contention scenarios.