Learning Multi-target Tracking with Quadratic Object Interactions
We describe a model for multi-target tracking based on associating collections of candidate detections across frames of a video. To model pairwise interactions between different tracks, such as suppression of overlapping tracks and contextual cues about co-occurrence of different objects, we augment a standard min-cost flow objective with quadratic terms between detection variables. We learn the parameters of this model using structured prediction and a loss function that approximates the multi-target tracking accuracy. We evaluate two approaches to finding an optimal set of tracks under the model objective: one based on an LP relaxation, and a novel greedy extension to dynamic programming that handles pairwise interactions. We find that the greedy algorithm achieves equivalent performance to the LP relaxation while being 2-7x faster than a commercial solver. The resulting model with learned parameters outperforms existing methods across several categories on the KITTI tracking benchmark.
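
As a rough illustration of what augmenting a min-cost flow objective with quadratic terms can look like, the sketch below writes one plausible form of such an objective. The notation is an assumption made for illustration and is not taken from the paper: x_i indicates whether candidate detection i is used by some track, y_{ij} indicates a transition link from detection i to detection j, y_{si} and y_{it} indicate track birth and death at detection i, c denotes the corresponding costs, and q_{ij} is a pairwise cost between detections i and j.

```latex
% Illustrative sketch only; all symbols are assumed, not the paper's notation.
\begin{aligned}
\min_{x,\,y}\quad
  & \sum_{i} c_i\, x_i
    + \sum_{(i,j) \in E} c_{ij}\, y_{ij}
    + \sum_{i} \bigl( c^{s}_{i}\, y_{si} + c^{t}_{i}\, y_{it} \bigr)
    + \sum_{i < j} q_{ij}\, x_i\, x_j \\
\text{s.t.}\quad
  & y_{si} + \sum_{j} y_{ji} \;=\; x_i \;=\; y_{it} + \sum_{j} y_{ij}
    \qquad \text{for every detection } i, \\
  & x_i,\; y_{ij},\; y_{si},\; y_{it} \in \{0, 1\}.
\end{aligned}
```

Without the last sum, this is a standard min-cost flow tracking objective. In this sketch, a positive q_{ij} would penalize selecting two overlapping detections together, and a negative q_{ij} would reward detections that tend to co-occur. Relaxing the binary constraints to the interval [0, 1] and linearizing the products x_i x_j is one common route to an LP relaxation.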