Results indicate that integrating our proposed method into state-of-the-art adaptive Siamese trackers can amplify the benefits of a template update strategy and significantly improve performance.
This paper proposes a model adaptation method for Siamese trackers that uses a generative model to produce a synthetic template from the object search regions of several previous frames, rather than directly using the tracker output.
Recent advances in unsupervised domain adaptation have significantly improved the recognition accuracy of CNNs by alleviating the domain shift between (labeled) source and (unlabeled) target data distributions.
Hence, our proposed student-teacher framework is trained to address the occlusion problem by matching the distributions of between- and within-class distances (DCDs) of occluded samples with that of holistic (non-occluded) samples, thereby using the latter as a soft labeled reference to learn well separated DCDs.
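The abstract does not give the loss in closed form, but the idea of matching between- and within-class distance distributions of occluded samples to those of holistic samples can be illustrated with a minimal NumPy sketch. All function names here are hypothetical, and the histogram-matching loss is a simplified, non-differentiable stand-in for whatever distribution-matching objective the paper actually uses:

```python
import numpy as np

def pairwise_dists(emb):
    """Euclidean distances between all pairs of row embeddings."""
    diff = emb[:, None, :] - emb[None, :, :]
    return np.sqrt((diff ** 2).sum(-1))

def split_dists(emb, labels):
    """Split pairwise distances into within-class and between-class sets."""
    d = pairwise_dists(emb)
    same = labels[:, None] == labels[None, :]
    eye = np.eye(len(labels), dtype=bool)
    return d[same & ~eye], d[~same]

def hist(x, bins, lo, hi):
    """Normalized histogram of a set of distances."""
    h, _ = np.histogram(x, bins=bins, range=(lo, hi))
    return h / max(h.sum(), 1)

def distance_distribution_loss(occ_emb, occ_lab, hol_emb, hol_lab, bins=10):
    """Illustrative loss: penalize the mismatch between the occluded
    samples' within/between-class distance histograms and those of the
    holistic samples, which act as the soft reference."""
    ow, ob = split_dists(occ_emb, occ_lab)
    hw, hb = split_dists(hol_emb, hol_lab)
    lo, hi = 0.0, max(ow.max(), ob.max(), hw.max(), hb.max())
    return (((hist(ow, bins, lo, hi) - hist(hw, bins, lo, hi)) ** 2).mean()
            + ((hist(ob, bins, lo, hi) - hist(hb, bins, lo, hi)) ** 2).mean())
```

When the occluded and holistic embeddings share the same distance structure, the loss is zero; in a real student-teacher setup a differentiable surrogate would be minimized by the student network.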
Our proposed approach is compared against state-of-the-art methods for compression and STDA of CNNs on the Office-31 and ImageCLEF-DA image classification datasets.
Our Mutual Attention network relies on the joint spatial attention between image and optical-flow feature maps to activate a common set of salient features across them.
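The abstract only names the mechanism, so the following is a minimal sketch of one plausible form of joint spatial attention: a per-location correlation between the two modalities' feature maps, softmax-normalized over space and used to re-weight both maps. The function name and exact formulation are assumptions, not the paper's published architecture:

```python
import numpy as np

def softmax(x, axis=-1):
    """Numerically stable softmax."""
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def mutual_spatial_attention(img_feat, flow_feat):
    """img_feat, flow_feat: arrays of shape (C, H, W).
    Build one joint spatial attention map from the per-location
    correlation of the two modalities, then re-weight both maps
    with it so they attend to the same salient regions."""
    C, H, W = img_feat.shape
    a = img_feat.reshape(C, -1)          # (C, HW)
    b = flow_feat.reshape(C, -1)         # (C, HW)
    corr = (a * b).sum(axis=0)           # per-location correlation, (HW,)
    attn = softmax(corr).reshape(1, H, W)  # shared attention, sums to 1
    return img_feat * attn, flow_feat * attn
```

Because a single attention map modulates both streams, locations where image and motion evidence agree are emphasized in both feature maps.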
Unsupervised domain adaptation (UDA) seeks to alleviate the problem of domain shift between the distribution of unlabeled data from the target domain w.r.t. the distribution of labeled data from the source domain.
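One standard way to quantify (and then minimize) the domain shift that UDA addresses is the maximum mean discrepancy between source and target feature distributions; the sketch below uses an RBF kernel. This is a generic illustration of the UDA objective, not the specific method of any abstract above:

```python
import numpy as np

def rbf_kernel(x, y, gamma=1.0):
    """RBF (Gaussian) kernel matrix between two sets of feature rows."""
    d = ((x[:, None, :] - y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def mmd2(source, target, gamma=1.0):
    """Squared maximum mean discrepancy between source and target
    features: zero when the two distributions match, larger under
    domain shift. UDA methods typically minimize a term like this
    alongside the supervised loss on labeled source data."""
    kss = rbf_kernel(source, source, gamma).mean()
    ktt = rbf_kernel(target, target, gamma).mean()
    kst = rbf_kernel(source, target, gamma).mean()
    return kss + ktt - 2.0 * kst
```

Identical source and target feature sets give an MMD of zero, while a shifted target distribution yields a positive value, which the feature extractor is trained to reduce.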
On both datasets, results indicate that our method achieves the highest accuracy while requiring comparable or lower time complexity.
Although deep neural networks (NNs) have achieved state-of-the-art accuracy in many visual recognition tasks, the growing computational complexity and energy consumption of these networks remains an issue, especially for applications on platforms with limited resources that require real-time processing.
Detecting faces and heads appearing in video feeds is a challenging task in real-world video surveillance applications due to variations in appearance, occlusions, and complex backgrounds.