Learning Dynamic Siamese Network for Visual Object Tracking

ICCV 2017  ·  Qing Guo, Wei Feng, Ce Zhou, Rui Huang, Liang Wan, Song Wang ·

How to effectively learn temporal variation of target appearance, to exclude the interference of cluttered background, while maintaining real-time response, is an essential problem of visual object tracking. Recently, Siamese networks have shown great potentials of matching based trackers in achieving balanced accuracy and beyond real-time speed. However, they still have a big gap to classification & updating based trackers in tolerating the temporal changes of objects and imaging conditions. In this paper, we propose dynamic Siamese network, via a fast transformation learning model that enables effective online learning of target appearance variation and background suppression from previous frames. We then present elementwise multi-layer fusion to adaptively integrate the network outputs using multi-level deep features. Unlike state-of-the-art trackers, our approach allows the usage of any feasible generally- or particularly-trained features, such as SiamFC and VGG. More importantly, the proposed dynamic Siamese network can be jointly trained as a whole directly on the labeled video sequences, thus can take full advantage of the rich spatial temporal information of moving objects. As a result, our approach achieves state-of-the-art performance on OTB-2013 and VOT-2015 benchmarks, while exhibits superiorly balanced accuracy and real-time response over state-of-the-art competitors.

PDF Abstract

Datasets


Results from the Paper


Results from Other Papers


Task Dataset Model Metric Name Metric Value Rank Source Paper Compare
Visual Object Tracking OTB-2013 DSiam AUC 0.656 # 5

Methods


No methods listed for this paper. Add relevant methods here