Tracking Multiple Deformable Objects in Egocentric Videos

CVPR 2023 · Mingzhen Huang, Xiaoxing Li, Jun Hu, Honghong Peng, Siwei Lyu ·

Most existing multiple object tracking (MOT) methods that solely rely on appearance features struggle in tracking highly deformable objects. Other MOT methods that use motion clues to associate identities across frames have difficulty handling egocentric videos effectively or efficiently. In this work, we propose DETracker, a new MOT method that jointly detects and tracks deformable objects in egocentric videos. DETracker uses three novel modules, namely the motion disentanglement network (MDN), the patch association network (PAN) and the patch memory network (PMN), to explicitly tackle the difficulties caused by severe ego motion and fast morphing target objects. DETracker is end-to-end trainable and achieves near real-time speed. We also present DogThruGlasses, a large-scale deformable multi-object tracking dataset, with 150 videos and 73K annotated frames, collected by smart glasses. DETracker outperforms existing state-of-the-art method on the DogThruGlasses dataset and YouTube-Hand dataset.

PDF Abstract