ByteTrack: Multi-Object Tracking by Associating Every Detection Box

Multi-object tracking (MOT) aims at estimating the bounding boxes and identities of objects in videos. Most methods obtain identities by associating detection boxes whose scores are higher than a threshold. Objects with low detection scores, e.g. occluded objects, are simply thrown away, which causes a non-negligible number of missed true objects and fragmented trajectories. To solve this problem, we present a simple, effective and generic association method that tracks by associating almost every detection box instead of only the high-score ones. For the low-score detection boxes, we utilize their similarities with tracklets to recover true objects and filter out the background detections. When applied to 9 different state-of-the-art trackers, our method achieves consistent IDF1 improvements ranging from 1 to 10 points. To advance the state of the art in MOT, we design a simple and strong tracker, named ByteTrack. For the first time, we achieve 80.3 MOTA, 77.3 IDF1 and 63.1 HOTA on the test set of MOT17 with 30 FPS running speed on a single V100 GPU. ByteTrack also achieves state-of-the-art performance on the MOT20, HiEve and BDD100K tracking benchmarks. The source code, pre-trained models with deployment versions, and tutorials on applying the method to other trackers are released at https://github.com/ifzhang/ByteTrack.
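The association idea in the abstract can be illustrated with a minimal sketch. It is not the released ByteTrack implementation: IoU as the similarity measure, the greedy matching, the threshold values, and the names `iou`, `greedy_match` and `associate` are all assumptions made for illustration only.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_match(
    tracks: Dict[int, Box], dets: List[int], boxes: List[Box], min_iou: float
) -> Dict[int, int]:
    """Pair track ids with detection indices, highest IoU first.

    Greedy matching is a simplification chosen for this sketch (assumption).
    """
    pairs = sorted(
        ((iou(tbox, boxes[d]), tid, d) for tid, tbox in tracks.items() for d in dets),
        reverse=True,
    )
    matches: Dict[int, int] = {}
    used = set()
    for score, tid, d in pairs:
        if score < min_iou:
            break  # remaining pairs are even less similar
        if tid in matches or d in used:
            continue
        matches[tid] = d
        used.add(d)
    return matches


def associate(
    tracks: Dict[int, Box],
    boxes: List[Box],
    scores: List[float],
    high_thresh: float = 0.6,  # hypothetical value
    low_thresh: float = 0.1,   # hypothetical value
    min_iou: float = 0.3,      # hypothetical value
) -> Tuple[Dict[int, int], List[int]]:
    """Two-stage association: high-score boxes first, then low-score boxes."""
    high = [i for i, s in enumerate(scores) if s >= high_thresh]
    low = [i for i, s in enumerate(scores) if low_thresh <= s < high_thresh]

    # Stage 1: match all tracklets against the high-score detections.
    first = greedy_match(tracks, high, boxes, min_iou)

    # Stage 2: match the still-unmatched tracklets against low-score detections.
    # Low-score boxes that match no tracklet are treated as background and dropped.
    remaining = {tid: b for tid, b in tracks.items() if tid not in first}
    second = greedy_match(remaining, low, boxes, min_iou)

    matches = {**first, **second}
    matched_high = set(first.values())
    unmatched_high = [i for i in high if i not in matched_high]  # new-track candidates
    return matches, unmatched_high
```

In this sketch a low-score box is kept only when it overlaps an existing tracklet, which mirrors the abstract's idea of recovering occluded objects while filtering out background detections. For example, with tracks `{1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}`, boxes `[(1, 1, 11, 11), (21, 21, 31, 31), (100, 100, 110, 110)]` and scores `[0.9, 0.3, 0.2]`, track 2 is recovered from the low-score box at index 1, and the isolated low-score box at index 2 is discarded.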

arXiv 2021

Results from the Paper


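Task                      Dataset     Model      Metric Name  Metric Value  Global Rank
Multiple Object Tracking  BDD100K     ByteTrack  mMOTA        45.5          # 1
Multiple Object Tracking  BDD100K     ByteTrack  mIDF1        54.8          # 3
Multi-Object Tracking     DanceTrack  ByteTrack  HOTA         47.1          # 6
Multi-Object Tracking     DanceTrack  ByteTrack  DetA         70.5          # 9
Multi-Object Tracking     DanceTrack  ByteTrack  AssA         31.5          # 5
Multi-Object Tracking     DanceTrack  ByteTrack  MOTA         88.2          # 5
Multi-Object Tracking     DanceTrack  ByteTrack  IDF1         51.9          # 4
Multi-Object Tracking     MOT17       ByteTrack  MOTA         80.3          # 3
Multi-Object Tracking     MOT17       ByteTrack  IDF1         77.3          # 6
Multi-Object Tracking     MOT20       ByteTrack  MOTA         77.8          # 2
Multi-Object Tracking     MOT20       ByteTrack  IDF1         75.2          # 5
Multi-Object Tracking     MOT20       ByteTrack  HOTA         61.3          # 5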

Methods