Target-Aware Tracking with Long-term Context Attention

27 Feb 2023  ·  Kaijie He, Canlong Zhang, Sheng Xie, Zhixin Li, Zhiwen Wang

Most deep trackers still follow the Siamese paradigm and use a template that contains only the target, without any contextual information. This makes it difficult for the tracker to cope with large appearance changes, rapid target movement, and distraction from similar objects. To alleviate this problem, we propose a long-term context attention (LCA) module that performs extensive information fusion on the target and its context from long-term frames, and calculates the target correlation while enhancing target features. The complete contextual information contains the location of the target as well as the state around the target. LCA uses the target state from the previous frame to exclude the interference of similar objects and complex backgrounds, thus accurately locating the target and giving the tracker higher robustness and regression accuracy. By embedding the LCA module in a Transformer, we build a powerful online tracker with a target-aware backbone, termed TATrack. In addition, we propose a dynamic online update algorithm based on the classification confidence of historical information, without additional calculation burden. Our tracker achieves state-of-the-art performance on multiple benchmarks, with 71.1% AUC, 89.3% NP, and 73.0% AO on LaSOT, TrackingNet, and GOT-10k, respectively. The code and trained models are available on
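The abstract does not spell out the update rule, so the following is only a minimal sketch of one way a confidence-based online template update could work. It assumes the tracker already produces a classification confidence per frame, which is why the update adds no extra computation. The class name `ConfidenceGatedTemplate`, the parameters `update_interval` and `min_confidence`, and their default values are hypothetical and are not taken from the paper.

```python
class ConfidenceGatedTemplate:
    """Hypothetical sketch: keep the most confident recent frame as the online template.

    This is NOT the authors' algorithm; it only illustrates the general idea of
    reusing an existing classification confidence to decide when to update.
    """

    def __init__(self, initial_template, update_interval=50, min_confidence=0.6):
        self.template = initial_template        # current online template
        self.update_interval = update_interval  # assumed: frames per decision window
        self.min_confidence = min_confidence    # assumed: reject unreliable candidates
        self._reset_window()

    def _reset_window(self):
        self._best_conf = float("-inf")
        self._best_candidate = None
        self._frames_seen = 0

    def observe(self, candidate, confidence):
        """Record one tracked frame; return True if the template was replaced."""
        self._frames_seen += 1
        # Remember the highest-confidence candidate seen in this window.
        if confidence > self._best_conf:
            self._best_conf = confidence
            self._best_candidate = candidate
        if self._frames_seen < self.update_interval:
            return False
        # Window is complete: update only if the best candidate is reliable enough.
        updated = self._best_conf >= self.min_confidence
        if updated:
            self.template = self._best_candidate
        self._reset_window()
        return updated
```

In a tracking loop, `observe` would be called once per frame with the cropped target region (or its features) and the score the classification head already outputs. A window whose best score falls below `min_confidence` leaves the previous template in place.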


Results from the Paper

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Video Object Tracking | GOT-10k | TATrack-L-GOT | Average Overlap | 76.6 | # 1 |
| Visual Object Tracking | GOT-10k | TATrack-L-GOT | Average Overlap | 76.6 | # 5 |
| | | | Success Rate 0.5 | 85.7 | # 6 |
| | | | Success Rate 0.75 | 73.4 | # 4 |
| Visual Tracking | LaSOT | TATrack-L | AUC | 71.1 | # 1 |
| Visual Object Tracking | LaSOT | TATrack-L | AUC | 71.1 | # 12 |
| | | | Normalized Precision | 79.1 | # 14 |
| | | | Precision | 76.1 | # 13 |
| Visual Object Tracking | TrackingNet | TATrack-L | Precision | 84.5 | # 7 |
| | | | Normalized Precision | 89.3 | # 5 |
| | | | Accuracy | 85.0 | # 8 |
| Visual Tracking | TrackingNet | TATrack-L | ACCURACY | 0.85 | # 1 |
| | | | Normalized Precision | 89.3 | # 1 |
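
The GOT-10k rows above report Average Overlap and Success Rate at thresholds 0.5 and 0.75. As a rough illustration of what these metrics measure, the sketch below computes them for a single sequence from per-frame intersection-over-union (IoU). It is a simplified version: the official benchmark toolkit aggregates over many sequences and may differ in detail, and the function names and the `(x, y, w, h)` box layout are assumptions made for this example.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two boxes given as (x, y, w, h)."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    # Width and height of the overlap region (zero if the boxes are disjoint).
    iw = max(0.0, min(ax + aw, bx + bw) - max(ax, bx))
    ih = max(0.0, min(ay + ah, by + bh) - max(ay, by))
    inter = iw * ih
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def average_overlap(ious):
    """Mean IoU over all frames of a sequence."""
    return sum(ious) / len(ious)


def success_rate(ious, threshold):
    """Fraction of frames whose IoU exceeds the given threshold."""
    return sum(1 for v in ious if v > threshold) / len(ious)


# Toy sequence: one perfect prediction, one partial overlap, one miss.
predictions = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
ground_truth = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 5, 5)]
ious = [iou(p, g) for p, g in zip(predictions, ground_truth)]

print(f"AO      = {average_overlap(ious):.3f}")
print(f"SR 0.5  = {success_rate(ious, 0.5):.3f}")
print(f"SR 0.75 = {success_rate(ious, 0.75):.3f}")
```

The toy boxes are made up for illustration and have no relation to the reported results.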