Leveraging Instance-, Image- and Dataset-Level Information for Weakly Supervised Instance Segmentation
Weakly supervised semantic instance segmentation with only image-level supervision, instead of relying on expensive pixel-wise masks or bounding box annotations, is an important problem because it alleviates the data-hungry nature of deep learning. In this paper, we tackle this challenging problem by aggregating the image-level information of all training images into a large knowledge graph and exploiting semantic relationships from this graph. Specifically, our effort starts with generic segment-based object proposals (SOP) without category priors. We propose a multiple instance learning (MIL) framework, which can be trained in an end-to-end manner using training images with image-level labels. For each proposal, this MIL framework simultaneously computes a probability distribution over categories and category-aware semantic features, from which we formulate a large undirected graph. The background category is also included in this graph to remove the many noisy object proposals. An optimal multi-way cut of this graph can thus assign a reliable category label to each proposal. The denoised SOP with assigned category labels can be viewed as pseudo instance segmentation of the training images, which is used to train fully supervised models. The proposed approach achieves state-of-the-art performance for both weakly supervised instance segmentation and semantic segmentation.
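The graph-labeling step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it replaces the multi-way cut with iterated conditional modes (ICM), a simple greedy energy minimizer, and every name, parameter, and threshold here (`label_proposals`, `pairwise_weight`, `sim_threshold`, background at index 0) is a hypothetical choice made for illustration. Each proposal pays a unary cost from its predicted probability and a pairwise cost for disagreeing with neighbors that have similar features.

```python
import math


def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0


def label_proposals(probs, feats, pairwise_weight=1.0, sim_threshold=0.5,
                    max_iters=10):
    """Assign one label per proposal by greedy energy minimization (ICM).

    probs: per-proposal class probabilities; index 0 is assumed to be
           the background category (an assumption of this sketch).
    feats: per-proposal feature vectors used to build graph edges.
    """
    n = len(probs)
    num_labels = len(probs[0])

    # Build an undirected graph: connect proposals with similar features.
    neighbors = [[] for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            sim = cosine(feats[i], feats[j])
            if sim >= sim_threshold:
                neighbors[i].append((j, sim))
                neighbors[j].append((i, sim))

    # Initialize each proposal with its most probable label.
    labels = [max(range(num_labels), key=lambda k: p[k]) for p in probs]

    eps = 1e-12  # avoids log(0)
    for _ in range(max_iters):
        changed = False
        for i in range(n):
            costs = []
            for k in range(num_labels):
                unary = -math.log(probs[i][k] + eps)
                # Penalty for each neighbor currently holding another label.
                pair = sum(w for j, w in neighbors[i] if labels[j] != k)
                costs.append(unary + pairwise_weight * pair)
            best = min(range(num_labels), key=lambda k: costs[k])
            if best != labels[i]:
                labels[i] = best
                changed = True
        if not changed:
            break
    return labels


# Toy data: three similar proposals and one dissimilar, background-like one.
probs = [[0.1, 0.8, 0.1], [0.1, 0.7, 0.2], [0.2, 0.35, 0.45], [0.9, 0.05, 0.05]]
feats = [[1.0, 0.0], [0.9, 0.1], [0.95, 0.05], [0.0, 1.0]]
print(label_proposals(probs, feats))  # [1, 1, 1, 0]
```

In the toy data, the third proposal on its own slightly prefers label 2, but its two similar neighbors pull it to label 1, while the fourth proposal stays background. ICM only finds a local minimum; an actual multi-way cut solver would optimize the labeling globally.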
Results from the Paper
| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Image-level Supervised Instance Segmentation | COCO test-dev | LIID | AP | 16.0 | # 4 |
| Image-level Supervised Instance Segmentation | COCO test-dev | LIID | AP@50 | 27.1 | # 5 |
| Image-level Supervised Instance Segmentation | COCO test-dev | LIID | AP@75 | 16.5 | # 4 |
| Weakly-Supervised Semantic Segmentation | PASCAL VOC 2012 test | LIID | Mean IoU | 67.5 | # 57 |
| Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | LIID | mAP@0.5 | 48.4 | # 7 |
| Image-level Supervised Instance Segmentation | PASCAL VOC 2012 val | LIID | mAP@0.75 | 24.9 | # 7 |
| Weakly-supervised instance segmentation | PASCAL VOC 2012 val | LIID | mAP@0.25 | - | # 6 |
| Weakly-supervised instance segmentation | PASCAL VOC 2012 val | LIID | mAP@0.5 | 48.4 | # 5 |
| Weakly-supervised instance segmentation | PASCAL VOC 2012 val | LIID | mAP@0.75 | 24.9 | # 5 |
| Weakly-supervised instance segmentation | PASCAL VOC 2012 val | LIID | Average Best Overlap | 50.8 | # 2 |
| Weakly-Supervised Semantic Segmentation | PASCAL VOC 2012 val | LIID (ResNet-101) | Mean IoU | 66.5 | # 66 |
| Weakly-Supervised Semantic Segmentation | PASCAL VOC 2012 val | LIID (ResNet-101, +24K SI) | Mean IoU | 67.8 | # 60 |
| Weakly-Supervised Semantic Segmentation | PASCAL VOC 2012 val | LIID (Res2Net-101) | Mean IoU | 69.4 | # 49 |