Target Transformed Regression for Accurate Tracking

1 Apr 2021  ·  Yutao Cui, Cheng Jiang, LiMin Wang, Gangshan Wu

Accurate tracking remains a challenging task because of appearance variations, pose and view changes, and geometric deformations of the target in videos. Recent anchor-free trackers provide an efficient regression mechanism but fail to produce precise bounding box estimates. To address these issues, this paper repurposes a Transformer-like regression branch, termed Target Transformed Regression (TREG), for accurate anchor-free tracking. The core of TREG is to model the pair-wise relations between elements in the target template and the search region, and to use the resulting target-enhanced visual representation for accurate bounding box regression. This target-contextualized representation enhances target-relevant information to help locate the box boundaries precisely, and it handles object deformation to some extent thanks to its local and dense matching mechanism. In addition, we devise a simple online template update mechanism to select reliable templates, increasing robustness to appearance variations and geometric deformations of the target over time. Experimental results on visual tracking benchmarks including VOT2018, VOT2019, OTB100, GOT10k, NFS, UAV123, LaSOT and TrackingNet demonstrate that TREG obtains state-of-the-art performance, achieving a success rate of 0.640 on LaSOT while running at around 30 FPS. The code and models will be made available at
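
The pair-wise relation modeling described in the abstract resembles cross-attention between search-region elements and template elements. The following is a minimal sketch of that idea, not the authors' implementation: it assumes flattened feature maps, omits any learned projections, and uses hypothetical names (`target_enhanced_features`) and feature sizes (22x22 search, 7x7 template, 64 channels) chosen only for illustration.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def target_enhanced_features(search, template):
    """Enhance search-region features with template context.

    search:   (N_s, C) array, one row per search-region element.
    template: (N_t, C) array, one row per target-template element.
    Returns an (N_s, C) array. Hypothetical helper; no learned weights.
    """
    channels = search.shape[1]
    # Pair-wise affinity between every search element and every template element.
    affinity = search @ template.T / np.sqrt(channels)   # (N_s, N_t)
    # Normalize over template elements so each search element gets a distribution.
    weights = softmax(affinity, axis=1)
    # Aggregate template features for each search element.
    context = weights @ template                          # (N_s, C)
    # Residual connection keeps the original search information (an assumption).
    return search + context

rng = np.random.default_rng(0)
search = rng.standard_normal((22 * 22, 64))    # assumed search feature map, flattened
template = rng.standard_normal((7 * 7, 64))    # assumed template feature map, flattened
enhanced = target_enhanced_features(search, template)
print(enhanced.shape)
```

In a tracker of this kind, the enhanced features would then feed a regression head that predicts box offsets at each search location; that head is not shown here.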

Results from the Paper

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Visual Object Tracking | GOT-10k | TREG | Average Overlap | 66.8 | # 21 |
| | | | Success Rate 0.5 | 77.8 | # 16 |
| | | | Success Rate 0.75 | 57.2 | # 17 |
| Visual Object Tracking | TrackingNet | TREG | Precision | 75 | # 18 |
| | | | Normalized Precision | 83.8 | # 20 |
| | | | Accuracy | 78.5 | # 21 |
| Visual Object Tracking | UAV123 | TREG | AUC | 0.669 | # 13 |
| | | | Precision | 0.884 | # 3 |
| Visual Object Tracking | VOT2018 | TREG | Expected Average Overlap (EAO) | 0.496 | # 1 |
| | | | Accuracy | 61.2 | # 1 |
| Visual Object Tracking | VOT2019 | TREG | Expected Average Overlap (EAO) | 0.391 | # 1 |
| | | | Accuracy | 60.3 | # 1 |
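
For readers unfamiliar with the GOT-10k metrics listed above, the sketch below shows how Average Overlap and Success Rate at a threshold are commonly computed from per-frame IoU values. It assumes boxes in (x, y, width, height) format and uses made-up boxes; the function names are hypothetical and this is not the official evaluation toolkit, which also averages over sequences and classes.

```python
import numpy as np

def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax1, ay1, ax2, ay2 = box_a[0], box_a[1], box_a[0] + box_a[2], box_a[1] + box_a[3]
    bx1, by1, bx2, by2 = box_b[0], box_b[1], box_b[0] + box_b[2], box_b[1] + box_b[3]
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = box_a[2] * box_a[3] + box_b[2] * box_b[3] - inter
    return inter / union if union > 0 else 0.0

def average_overlap(ious):
    # Mean IoU over all frames.
    return float(np.mean(ious))

def success_rate(ious, threshold):
    # Fraction of frames whose IoU exceeds the threshold.
    return float(np.mean(np.asarray(ious) > threshold))

# Illustrative (made-up) ground-truth and predicted boxes for three frames.
ground_truth = [(0, 0, 10, 10), (0, 0, 10, 10), (0, 0, 10, 10)]
predictions = [(0, 0, 10, 10), (5, 0, 10, 10), (20, 20, 10, 10)]
ious = [iou(p, g) for p, g in zip(predictions, ground_truth)]

print("AO:", average_overlap(ious))
print("SR@0.5:", success_rate(ious, 0.5))
print("SR@0.75:", success_rate(ious, 0.75))
```

The table reports these quantities as percentages, so a value of 66.8 corresponds to a mean overlap of 0.668.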

