Autoregressive Visual Tracking
We present ARTrack, an autoregressive framework for visual object tracking. ARTrack tackles tracking as a coordinate sequence interpretation task that estimates object trajectories progressively, where the current estimate is induced by previous states and in turn affects subsequent estimates. This time-autoregressive approach models the sequential evolution of trajectories to keep tracing the object across frames, making it superior to existing template-matching-based trackers that only consider per-frame localization accuracy. ARTrack is simple and direct, eliminating customized localization heads and post-processing. Despite its simplicity, ARTrack achieves state-of-the-art performance on prevailing benchmark datasets.
CVPR 2023
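The autoregressive loop described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the paper's code: the bin count `NUM_BINS`, the helper names `box_to_tokens`, `tokens_to_box`, and `track`, the `history` length, and in particular `constant_velocity_stub`, which stands in for a learned decoder that would also condition on image features. The sketch only shows the data flow: boxes are quantized into coordinate tokens, and each predicted box is appended to the trajectory that conditions the next prediction.

```python
from typing import Callable, List, Sequence, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max) in pixels
Tokens = List[int]

NUM_BINS = 400  # hypothetical number of quantization bins per coordinate


def box_to_tokens(box: Box, frame_size: float, num_bins: int = NUM_BINS) -> Tokens:
    """Quantize each coordinate into an integer token in [0, num_bins - 1]."""
    tokens = []
    for value in box:
        clipped = min(max(value / frame_size, 0.0), 1.0)
        tokens.append(min(int(clipped * num_bins), num_bins - 1))
    return tokens


def tokens_to_box(tokens: Sequence[int], frame_size: float, num_bins: int = NUM_BINS) -> Box:
    """Map tokens back to pixel coordinates using the center of each bin."""
    x0, y0, x1, y1 = ((t + 0.5) / num_bins * frame_size for t in tokens)
    return (x0, y0, x1, y1)


def track(
    init_box: Box,
    num_frames: int,
    predict_next: Callable[[List[Tokens], int], Tokens],
    frame_size: float,
    history: int = 3,
) -> List[Box]:
    """Roll out a trajectory: each estimate is conditioned on previous ones."""
    trajectory = [box_to_tokens(init_box, frame_size)]
    for frame_index in range(1, num_frames):
        prompt = trajectory[-history:]  # most recent estimates, oldest first
        # The new estimate joins the trajectory and influences later frames.
        trajectory.append(predict_next(prompt, frame_index))
    return [tokens_to_box(t, frame_size) for t in trajectory]


def constant_velocity_stub(prompt: List[Tokens], frame_index: int) -> Tokens:
    """Placeholder for a learned decoder: extrapolates the last token displacement.

    It ignores frame_index; a real model would use it to fetch the frame's features.
    """
    if len(prompt) < 2:
        return list(prompt[-1])
    return [min(max(2 * b - a, 0), NUM_BINS - 1) for a, b in zip(prompt[-2], prompt[-1])]
```

A call such as `track((0, 0, 100, 100), num_frames=3, predict_next=constant_velocity_stub, frame_size=400)` returns one box per frame. Swapping the stub for a trained model would leave the loop unchanged, which is the sense in which no separate localization head or post-processing step appears in this sketch.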
Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Visual Object Tracking | GOT-10k | ARTrack-L | Average Overlap | 78.5 | # 7 |
Visual Object Tracking | GOT-10k | ARTrack-L | Success Rate 0.5 | 87.4 | # 8 |
Visual Object Tracking | GOT-10k | ARTrack-L | Success Rate 0.75 | 77.8 | # 5 |
Visual Object Tracking | LaSOT | ARTrack-L | AUC | 73.1 | # 13 |
Visual Object Tracking | LaSOT | ARTrack-L | Normalized Precision | 82.2 | # 11 |
Visual Object Tracking | LaSOT | ARTrack-L | Precision | 80.3 | # 7 |
Visual Object Tracking | LaSOT-ext | ARTrack-L | AUC | 52.8 | # 12 |
Visual Object Tracking | LaSOT-ext | ARTrack-L | Normalized Precision | 62.9 | # 10 |
Visual Object Tracking | LaSOT-ext | ARTrack-L | Precision | 59.7 | # 11 |
Video Object Tracking | NT-VOT211 | ARTrack-L | AUC | 35.92 | # 18 |
Video Object Tracking | NT-VOT211 | ARTrack-L | Precision | 51.64 | # 14 |
Visual Tracking | TNL2K | ARTrack-L | AUC | 60.3 | # 1 |
Visual Object Tracking | TNL2K | ARTrack-L | AUC | 60.3 | # 9 |
Visual Object Tracking | TrackingNet | ARTrack-L | Precision | 86.0 | # 6 |
Visual Object Tracking | TrackingNet | ARTrack-L | Normalized Precision | 89.6 | # 8 |
Visual Object Tracking | TrackingNet | ARTrack-L | Accuracy | 85.6 | # 7 |
Visual Object Tracking | UAV123 | ARTrack-L | AUC | 0.712 | # 5 |