Doubly Deformable Aggregation of Covariance Matrices for Few-shot Segmentation

30 Jul 2022 · Zhitong Xiong, Haopeng Li, Xiao Xiang Zhu

Training semantic segmentation models with few annotated samples has great potential in various real-world applications. For the few-shot segmentation task, the main challenge is to accurately measure the semantic correspondence between the support and query samples with limited training data. To address this problem, we propose to aggregate learnable covariance matrices with a deformable 4D Transformer to effectively predict the segmentation map. Specifically, we first devise a novel hard example mining mechanism to learn covariance kernels for the Gaussian process. The learned covariance kernel functions have clear advantages over existing cosine similarity-based methods in correspondence measurement. Based on the learned covariance kernels, an efficient doubly deformable 4D Transformer module is designed to adaptively aggregate feature similarity maps into segmentation results. By combining these two designs, the proposed method not only sets new state-of-the-art performance on public benchmarks but also converges much faster than existing methods. Experiments on three public datasets demonstrate the effectiveness of our method.
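
To make the contrast between cosine similarity and a covariance kernel concrete, the following is a minimal sketch and not the authors' implementation. It builds a 4D correlation tensor between query and support feature maps in two ways: with cosine similarity, and with a fixed RBF (squared-exponential) kernel, a standard Gaussian-process covariance function used here as a stand-in for the learned kernels described above. The function names, the `length_scale` parameter, and the feature shapes are assumptions made for illustration.

```python
import numpy as np


def cosine_correlation(query, support, eps=1e-8):
    """4D cosine-similarity map between two feature maps.

    query:   (C, Hq, Wq) feature map of the query image
    support: (C, Hs, Ws) feature map of the support image
    returns: (Hq, Wq, Hs, Ws) tensor of pairwise cosine similarities
    """
    c, hq, wq = query.shape
    _, hs, ws = support.shape
    q = query.reshape(c, -1)    # (C, Hq*Wq)
    s = support.reshape(c, -1)  # (C, Hs*Ws)
    # L2-normalise every spatial feature vector along the channel axis.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)
    return (q.T @ s).reshape(hq, wq, hs, ws)


def rbf_covariance(query, support, length_scale=1.0):
    """4D covariance map using a fixed RBF kernel (illustrative stand-in).

    The paper learns its kernels; this fixed kernel is only an assumption
    chosen to show the shape of the computation.
    """
    c, hq, wq = query.shape
    _, hs, ws = support.shape
    q = query.reshape(c, -1).T    # (Hq*Wq, C)
    s = support.reshape(c, -1).T  # (Hs*Ws, C)
    # Squared Euclidean distance between every query/support feature pair.
    sq_dist = (q ** 2).sum(1)[:, None] + (s ** 2).sum(1)[None, :] - 2.0 * q @ s.T
    sq_dist = np.maximum(sq_dist, 0.0)  # guard against tiny negative values
    cov = np.exp(-sq_dist / (2.0 * length_scale ** 2))
    return cov.reshape(hq, wq, hs, ws)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    query_feat = rng.normal(size=(8, 3, 4))
    support_feat = rng.normal(size=(8, 5, 6))
    print(cosine_correlation(query_feat, support_feat).shape)
    print(rbf_covariance(query_feat, support_feat).shape)
```

Both functions return a tensor indexed by a query position and a support position, which is the kind of 4D similarity volume that an aggregation module would then consume. Unlike cosine similarity, the RBF kernel depends on feature magnitude as well as direction, and its length scale is a parameter that could in principle be learned.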

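| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | DACM (VAT, ResNet-50) | Mean IoU | 43 | #37 |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | DACM (VAT, ResNet-50) | FB-IoU | 69.4 | #19 |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | DACM (ResNet-50) | Mean IoU | 40.6 | #52 |
| Few-Shot Semantic Segmentation | COCO-20i (1-shot) | DACM (ResNet-50) | FB-IoU | 68.9 | #21 |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | DACM (VAT, ResNet-50) | Mean IoU | 49.2 | #35 |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | DACM (VAT, ResNet-50) | FB-IoU | 72.9 | #13 |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | DACM (ResNet-50) | Mean IoU | 48.1 | #40 |
| Few-Shot Semantic Segmentation | COCO-20i (5-shot) | DACM (ResNet-50) | FB-IoU | 71.6 | #21 |
| Few-Shot Semantic Segmentation | FSS-1000 (1-shot) | DACM (ResNet-101) | Mean IoU | 90.8 | #1 |
| Few-Shot Semantic Segmentation | FSS-1000 (1-shot) | DACM (ResNet-50) | Mean IoU | 90.7 | #2 |
| Few-Shot Semantic Segmentation | FSS-1000 (5-shot) | DACM (ResNet-101) | Mean IoU | 91.7 | #1 |
| Few-Shot Semantic Segmentation | FSS-1000 (5-shot) | DACM (ResNet-50) | Mean IoU | 91.6 | #2 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VGG-16) | Mean IoU | 61.8 | #66 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VGG-16) | FB-IoU | 75.5 | #36 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VAT, ResNet-101) | Mean IoU | 69.1 | #11 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VAT, ResNet-101) | FB-IoU | 79.4 | #10 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (ResNet-101) | Mean IoU | 67.5 | #22 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (ResNet-101) | FB-IoU | 78.9 | #14 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VAT, ResNet-50) | Mean IoU | 66.8 | #29 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (VAT, ResNet-50) | FB-IoU | 78.6 | #16 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (ResNet-50) | Mean IoU | 65.7 | #40 |
| Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | DACM (ResNet-50) | FB-IoU | 77.8 | #23 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VGG-16) | Mean IoU | 65.7 | #65 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VGG-16) | FB-IoU | 77.8 | #33 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (ResNet-50) | Mean IoU | 70.9 | #28 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (ResNet-50) | FB-IoU | 81.3 | #20 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (ResNet-101) | Mean IoU | 71.4 | #24 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (ResNet-101) | FB-IoU | 81.5 | #16 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VAT, ResNet-101) | Mean IoU | 73.3 | #11 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VAT, ResNet-101) | FB-IoU | 83.1 | #7 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VAT, ResNet-50) | Mean IoU | 71.7 | #20 |
| Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | DACM (VAT, ResNet-50) | FB-IoU | 81.7 | #15 |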
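
The two metrics reported above can be illustrated with a small sketch. It assumes the definitions commonly used in few-shot segmentation: IoU is intersection over union of the predicted and ground-truth foreground masks, and FB-IoU averages the foreground IoU with the background IoU. Mean IoU is then typically an average of foreground IoU over the test classes. The helper names are hypothetical, and this per-image version is a simplification: benchmark protocols usually accumulate intersections and unions over many episodes before dividing, so the code below would not reproduce the listed numbers.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two binary masks of the same shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Both masks are empty; treat as a perfect match (a convention
        # assumed for this sketch).
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def fb_iou(pred, target):
    """Average of foreground IoU and background IoU for one image."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    foreground = iou(pred, target)
    background = iou(~pred, ~target)
    return 0.5 * (foreground + background)


if __name__ == "__main__":
    prediction = np.array([[1, 1], [0, 0]])
    ground_truth = np.array([[1, 0], [0, 0]])
    print(iou(prediction, ground_truth))
    print(fb_iou(prediction, ground_truth))
```

In this toy example the foreground IoU is 1/2 and the background IoU is 2/3, so FB-IoU is higher than the foreground IoU. That gap is consistent with the table above, where FB-IoU values exceed the corresponding Mean IoU values.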
