CrossMatch: Improving Semi-Supervised Object Detection via Multi-Scale Consistency

29 Sep 2021 · Zhuoran Yu, Yen-Cheng Liu, Chih-Yao Ma, Zsolt Kira

We present a novel method, CrossMatch, for semi-supervised object detection. Motivated by the observation that teacher-student pseudo-labeling approaches yield a weak and sparse gradient signal because of the difficulty of confidence thresholding, CrossMatch leverages multi-scale feature extraction in object detection. Specifically, we enforce consistency between different scales across the student and teacher networks. To the best of our knowledge, this is the first work to use multi-scale consistency in semi-supervised object detection. Furthermore, unlike prior work that mostly uses hard pseudo-labeling, CrossMatch densifies the gradient signal by enforcing multi-scale consistency through both hard and soft labels. This combination effectively strengthens the weak supervision signal from potentially noisy pseudo-labels. We evaluate our method on MS COCO and Pascal VOC under different experimental protocols, where it significantly improves on the previous state of the art. Specifically, CrossMatch achieves 17.33 and 21.53 mAP with only 0.5% and 1% labeled data, respectively, on MS COCO, outperforming other state-of-the-art methods by roughly 3 mAP.
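
The abstract does not give the exact loss formulation, so the following is only a minimal sketch of what combining hard and soft labels across scales could look like for the classification output of a single region. Everything concrete in it is an assumption made for illustration rather than the authors' implementation: the function names, the scale keys, the 0.7 confidence threshold, the choice of KL divergence as the soft term, and the rule that each teacher scale supervises every different student scale.

```python
import math

EPS = 1e-12  # numerical guard inside logarithms


def softmax(logits):
    """Convert raw class scores to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def hard_loss(teacher_probs, student_probs, threshold=0.7):
    """Cross-entropy against the teacher's argmax class.

    Contributes nothing when the teacher is below the (assumed) confidence
    threshold, which is what makes hard pseudo-labels a sparse signal.
    """
    conf = max(teacher_probs)
    if conf < threshold:
        return 0.0
    label = teacher_probs.index(conf)
    return -math.log(student_probs[label] + EPS)


def soft_loss(teacher_probs, student_probs):
    """KL(teacher || student), used here as an assumed soft-label term.

    Unlike the hard term, it is computed for every prediction.
    """
    return sum(
        t * math.log((t + EPS) / (s + EPS))
        for t, s in zip(teacher_probs, student_probs)
    )


def cross_scale_consistency(teacher_logits, student_logits,
                            threshold=0.7, soft_weight=1.0):
    """Average hard + soft loss over all (teacher scale, student scale)
    pairs whose scales differ. Inputs map a scale name to class logits."""
    total, pairs = 0.0, 0
    for t_scale, t_logits in teacher_logits.items():
        t_probs = softmax(t_logits)
        for s_scale, s_logits in student_logits.items():
            if s_scale == t_scale:
                continue  # only cross-scale pairs are considered
            s_probs = softmax(s_logits)
            total += hard_loss(t_probs, s_probs, threshold)
            total += soft_weight * soft_loss(t_probs, s_probs)
            pairs += 1
    return total / pairs if pairs else 0.0


if __name__ == "__main__":
    # Hypothetical logits for one region at two feature scales, three classes.
    teacher = {"p3": [4.0, 0.0, 0.0], "p4": [3.5, 0.5, 0.0]}
    student = {"p3": [1.0, 0.8, 0.2], "p4": [0.5, 1.5, 0.0]}
    print(cross_scale_consistency(teacher, student))
```

In this sketch the hard term is zero whenever the teacher is below the threshold, while the soft term still yields a loss for every pair, which is one way to read the abstract's claim that soft labels densify the gradient signal.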
