Point Cloud Registration-Driven Robust Feature Matching for 3D Siamese Object Tracking
Learning robust feature matching between the template and search area is crucial for 3D Siamese tracking. The core of Siamese feature matching is how to assign high feature similarity on the corresponding points between the template and search area for precise object localization. In this paper, we propose a novel point cloud registration-driven Siamese tracking framework, with the intuition that spatially aligned corresponding points (via 3D registration) tend to achieve consistent feature representations. Specifically, our method consists of two modules, including a tracking-specific nonlocal registration module and a registration-aided Sinkhorn template-feature aggregation module. The registration module targets at the precise spatial alignment between the template and search area. The tracking-specific spatial distance constraint is proposed to refine the cross-attention weights in the nonlocal module for discriminative feature learning. Then, we use the weighted SVD to compute the rigid transformation between the template and search area, and align them to achieve the desired spatially aligned corresponding points. For the feature aggregation model, we formulate the feature matching between the transformed template and search area as an optimal transport problem and utilize the Sinkhorn optimization to search for the outlier-robust matching solution. Also, a registration-aided spatial distance map is built to improve the matching robustness in indistinguishable regions (e.g., smooth surface). Finally, guided by the obtained feature matching map, we aggregate the target information from the template into the search area to construct the target-specific feature, which is then fed into a CenterPoint-like detection head for object localization. Extensive experiments on KITTI, NuScenes and Waymo datasets verify the effectiveness of our proposed method.
PDF Abstract