no code implementations • 8 Mar 2024 • Liting Lin, Heng Fan, Zhipeng Zhang, YaoWei Wang, Yong Xu, Haibin Ling
The shared embeddings, which describe the absolute coordinates of multi-resolution images (namely, the template and search images), are inherited from the pre-trained backbones.
1 code implementation • 2 Dec 2021 • Liting Lin, Heng Fan, Zhipeng Zhang, Yong Xu, Haibin Ling
The potential of Transformer in representation learning remains under-explored.
Ranked #9 on Visual Object Tracking on TrackingNet
1 code implementation • 8 Sep 2020 • Heng Fan, Hexin Bai, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Harshit, Mingzhen Huang, Juehuan Liu, Yong Xu, Chunyuan Liao, Lin Yuan, Haibin Ling
The average video length of LaSOT is around 2,500 frames, and each video contains various challenge factors that exist in real-world video footage, such as targets disappearing and re-appearing.
1 code implementation • CVPR 2019 • Heng Fan, Liting Lin, Fan Yang, Peng Chu, Ge Deng, Sijia Yu, Hexin Bai, Yong Xu, Chunyuan Liao, Haibin Ling
In this paper, we present LaSOT, a high-quality benchmark for Large-scale Single Object Tracking.