Strong-TransCenter: Improved Multi-Object Tracking based on Transformers with Dense Representations

24 Oct 2022 · Amit Galor, Roy Orfaig, Ben-Zion Bobrovsky

Transformer networks have been a focus of research in many fields in recent years, surpassing state-of-the-art performance in various computer vision tasks. A few attempts have been made to apply them to the task of Multiple Object Tracking (MOT); among those, the state of the art was TransCenter, a transformer-based MOT architecture with dense object queries that accurately tracks all objects while maintaining a reasonable runtime. TransCenter is the first center-based transformer framework for MOT and is also among the first to show the benefits of using transformer-based architectures for MOT. In this paper, we show an improvement to this tracker using two post-processing mechanisms based on the Track-by-Detection paradigm: motion model estimation using a Kalman filter and target re-identification using an embedding network. Our new tracker shows significant improvements in the IDF1 and HOTA metrics and comparable results on the MOTA metric (70.9%, 59.8% and 75.8%, respectively) on the MOTChallenge MOT17 test dataset, and improvements on all three metrics (67.5%, 56.3% and 73.0%) on the MOT20 test dataset. Our tracker is currently ranked first among transformer-based trackers on these datasets. The code is publicly available at: https://github.com/amitgalor18/STC_Tracker
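To illustrate the two post-processing ideas named in the abstract, below is a minimal Python sketch of (1) a constant-velocity Kalman filter used as a motion model over object centers and (2) appearance re-identification via embedding similarity, combined into an association cost solved with the Hungarian algorithm. The class and function names, the linear cost weighting, and all noise parameters are illustrative assumptions for this sketch and are not taken from the STC_Tracker code.

```python
# Sketch of Kalman-filter motion modeling + embedding-based re-identification.
# Hypothetical names/parameters; not the actual STC_Tracker implementation.
import numpy as np
from scipy.optimize import linear_sum_assignment


class KalmanCenterTrack:
    """Constant-velocity Kalman filter over an object center (cx, cy)."""

    def __init__(self, center, embedding):
        self.x = np.array([center[0], center[1], 0.0, 0.0])  # state: [cx, cy, vx, vy]
        self.P = np.eye(4) * 10.0            # state covariance
        self.F = np.eye(4)                   # transition matrix
        self.F[0, 2] = self.F[1, 3] = 1.0    # position += velocity (dt = 1 frame)
        self.H = np.eye(2, 4)                # we observe (cx, cy) only
        self.Q = np.eye(4) * 0.01            # process noise (assumed)
        self.R = np.eye(2) * 1.0             # measurement noise (assumed)
        self.embedding = embedding / np.linalg.norm(embedding)

    def predict(self):
        self.x = self.F @ self.x
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.x[:2]                    # predicted center

    def update(self, center, embedding, alpha=0.9):
        z = np.asarray(center, dtype=float)
        y = z - self.H @ self.x              # innovation
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)
        self.x = self.x + K @ y
        self.P = (np.eye(4) - K @ self.H) @ self.P
        # Exponential moving average of the appearance embedding
        e = embedding / np.linalg.norm(embedding)
        self.embedding = alpha * self.embedding + (1 - alpha) * e
        self.embedding /= np.linalg.norm(self.embedding)


def associate(tracks, det_centers, det_embeddings, w_motion=0.5, max_cost=0.8):
    """Match detections to tracks with a cost mixing motion and appearance."""
    det_centers = np.asarray(det_centers, dtype=float)
    det_embeddings = np.asarray(det_embeddings, dtype=float)
    if not tracks or det_centers.size == 0:
        return [], list(range(len(tracks))), list(range(len(det_centers)))
    pred = np.stack([t.predict() for t in tracks])              # (T, 2)
    dist = np.linalg.norm(pred[:, None] - det_centers[None], axis=2)
    motion_cost = dist / (dist.max() + 1e-6)                    # normalize to [0, 1]
    emb_t = np.stack([t.embedding for t in tracks])             # (T, D)
    emb_d = det_embeddings / np.linalg.norm(det_embeddings, axis=1, keepdims=True)
    app_cost = 1.0 - emb_t @ emb_d.T                            # cosine distance
    cost = w_motion * motion_cost + (1 - w_motion) * app_cost
    rows, cols = linear_sum_assignment(cost)
    matches = [(r, c) for r, c in zip(rows, cols) if cost[r, c] <= max_cost]
    matched_t = {r for r, _ in matches}
    matched_d = {c for _, c in matches}
    unmatched_tracks = [i for i in range(len(tracks)) if i not in matched_t]
    unmatched_dets = [j for j in range(len(det_centers)) if j not in matched_d]
    return matches, unmatched_tracks, unmatched_dets
```

In this sketch the motion and appearance costs are blended linearly; many Track-by-Detection trackers instead use the motion term only to gate implausible matches before ranking by appearance, and the paper's tracker may weight or gate the terms differently.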

Benchmark results (task: Multiple Object Tracking with Transformer, model: STC_pub)

Dataset | Metric | Value | Global Rank
MOT20   | HOTA   | 56.1  | #1
MOT20   | MOTA   | 73.0  | #1
MOT20   | IDF1   | 67.6  | #1
