Hypercorrelation Squeeze for Few-Shot Segmentation
Few-shot semantic segmentation aims at learning to segment a target object from a query image using only a few annotated support images of the target class. This challenging task requires understanding diverse levels of visual cues and analyzing fine-grained correspondence relations between the query and the support images. To address the problem, we propose Hypercorrelation Squeeze Networks (HSNet), which leverage multi-level feature correlation and efficient 4D convolutions. HSNet extracts diverse features from different levels of intermediate convolutional layers and constructs a collection of 4D correlation tensors, i.e., hypercorrelations. Using efficient center-pivot 4D convolutions in a pyramidal architecture, the method gradually squeezes high-level semantic and low-level geometric cues of the hypercorrelation into precise segmentation masks in a coarse-to-fine manner. The significant performance improvements on the standard few-shot segmentation benchmarks PASCAL-5i, COCO-20i, and FSS-1000 verify the efficacy of the proposed method.
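To make the notion of a 4D correlation tensor concrete, the following is a minimal NumPy sketch of how one might correlate every query position with every support position at several feature levels. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not say which similarity is used, so cosine similarity is assumed here, and the names `correlation_4d` and `hypercorrelation` are hypothetical.

```python
import numpy as np


def correlation_4d(query_feat, support_feat, eps=1e-8):
    """Correlate two feature maps of shape (C, H, W) position by position.

    Returns a tensor of shape (Hq, Wq, Hs, Ws) whose entry [i, j, k, l] is the
    cosine similarity (an assumption of this sketch) between the query feature
    at (i, j) and the support feature at (k, l).
    """
    c, hq, wq = query_feat.shape
    c_s, hs, ws = support_feat.shape
    if c != c_s:
        raise ValueError("query and support must have the same channel count")

    # Flatten spatial dimensions: (C, H*W).
    q = query_feat.reshape(c, hq * wq)
    s = support_feat.reshape(c, hs * ws)

    # L2-normalise each spatial position's feature vector.
    q = q / (np.linalg.norm(q, axis=0, keepdims=True) + eps)
    s = s / (np.linalg.norm(s, axis=0, keepdims=True) + eps)

    # All-pairs dot products, then restore the four spatial axes.
    corr = q.T @ s  # (Hq*Wq, Hs*Ws)
    return corr.reshape(hq, wq, hs, ws)


def hypercorrelation(query_feats, support_feats):
    """Build one 4D correlation tensor per feature level (hypothetical helper)."""
    return [correlation_4d(q, s) for q, s in zip(query_feats, support_feats)]


# Toy example: random arrays standing in for features from two backbone levels.
rng = np.random.default_rng(0)
query_levels = [rng.standard_normal((16, 8, 8)), rng.standard_normal((32, 4, 4))]
support_levels = [rng.standard_normal((16, 8, 8)), rng.standard_normal((32, 4, 4))]
pyramid = hypercorrelation(query_levels, support_levels)
```

Each level yields a tensor whose size grows with the product of the query and support resolutions, which suggests why the abstract emphasizes efficient 4D convolutions for processing these tensors.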
Results from the Paper
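Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
---|---|---|---|---|---|
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-101) | Mean IoU | 41.2 | # 60 |
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-101) | FB-IoU | 69.1 | # 26 |
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-101) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-50) | Mean IoU | 39.2 | # 64 |
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-50) | FB-IoU | 68.2 | # 29 |
Few-Shot Semantic Segmentation | COCO-20i (1-shot) | HSNet (ResNet-50) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-50) | Mean IoU | 46.9 | # 56 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-50) | FB-IoU | 70.7 | # 29 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-50) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-101) | Mean IoU | 49.5 | # 40 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-101) | FB-IoU | 72.4 | # 21 |
Few-Shot Semantic Segmentation | COCO-20i (5-shot) | HSNet (ResNet-101) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | FSS-1000 (1-shot) | HSNet (VGG-16) | Mean IoU | 82.3 | # 23 |
Few-Shot Semantic Segmentation | FSS-1000 (1-shot) | HSNet (ResNet-101) | Mean IoU | 86.5 | # 17 |
Few-Shot Semantic Segmentation | FSS-1000 (1-shot) | HSNet (ResNet-50) | Mean IoU | 85.5 | # 21 |
Few-Shot Semantic Segmentation | FSS-1000 (5-shot) | HSNet (VGG-16) | Mean IoU | 85.8 | # 20 |
Few-Shot Semantic Segmentation | FSS-1000 (5-shot) | HSNet (ResNet-50) | Mean IoU | 87.8 | # 18 |
Few-Shot Semantic Segmentation | FSS-1000 (5-shot) | HSNet (ResNet-101) | Mean IoU | 88.5 | # 14 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-101) | Mean IoU | 66.2 | # 39 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-101) | FB-IoU | 77.6 | # 31 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-101) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-50) | Mean IoU | 64.0 | # 63 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-50) | FB-IoU | 76.7 | # 38 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (ResNet-50) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (VGG-16) | Mean IoU | 59.7 | # 83 |
Few-Shot Semantic Segmentation | PASCAL-5i (1-Shot) | HSNet (VGG-16) | FB-IoU | 73.4 | # 45 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (VGG-16) | Mean IoU | 64.1 | # 77 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (VGG-16) | FB-IoU | 76.6 | # 43 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-101) | Mean IoU | 70.4 | # 37 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-101) | FB-IoU | 80.6 | # 30 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-101) | learnable parameters (million) | 2.5 | # 3 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-50) | Mean IoU | 69.5 | # 45 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-50) | FB-IoU | 80.6 | # 30 |
Few-Shot Semantic Segmentation | PASCAL-5i (5-Shot) | HSNet (ResNet-50) | learnable parameters (million) | 2.5 | # 3 |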
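
For readers unfamiliar with the two IoU metrics in the table, here is a small per-image sketch. It assumes the usual definitions: IoU is intersection over union of binary masks, and FB-IoU averages the foreground IoU and the background IoU while ignoring object classes. Benchmark code typically accumulates intersections and unions over a whole test set (and, for Mean IoU, averages over classes), which this sketch does not reproduce. The function names are hypothetical.

```python
import numpy as np


def binary_iou(pred, target):
    """Intersection over union of two binary masks; NaN if both are empty."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    inter = np.logical_and(pred, target).sum()
    union = np.logical_or(pred, target).sum()
    return float(inter) / float(union) if union > 0 else float("nan")


def fb_iou(pred, target):
    """Average of foreground IoU and background IoU for one prediction."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    fg = binary_iou(pred, target)
    bg = binary_iou(~pred, ~target)  # background = complement of each mask
    return 0.5 * (fg + bg)


# Toy 2x2 example: the prediction covers one extra pixel.
pred_mask = np.array([[1, 1], [0, 0]])
true_mask = np.array([[1, 0], [0, 0]])
fg_score = binary_iou(pred_mask, true_mask)  # 1 / 2
fb_score = fb_iou(pred_mask, true_mask)      # (1/2 + 2/3) / 2
```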