Attention-guided Unified Network for Panoptic Segmentation

This paper studies panoptic segmentation, a recently proposed task that segments foreground (FG) objects at the instance level and background (BG) contents at the semantic level. Existing methods mostly treat these two problems separately, but in this paper we reveal the underlying relationship between them; in particular, FG objects provide complementary cues that assist BG understanding. Our approach, named the Attention-guided Unified Network (AUNet), is a unified framework with two branches that perform FG and BG segmentation simultaneously. Two sources of attention are added to the BG branch: the RPN and the FG segmentation mask, which provide object-level and pixel-level attention, respectively. Our approach generalizes to different backbones with consistent accuracy gains in both FG and BG segmentation, and sets new state-of-the-art results on both the MS-COCO (46.5% PQ) and Cityscapes (59.0% PQ) benchmarks.
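The core idea, gating BG-branch features with attention derived from the FG side, can be sketched as follows. This is a minimal NumPy illustration, not the paper's implementation: the exact attention form (here, a sigmoid gate applied as a residual multiplier) and all function names are assumptions for exposition.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def pixel_level_attention(bg_feat, fg_mask_logits):
    """Gate BG features with the FG segmentation mask (pixel-level cue).

    bg_feat:        (C, H, W) background-branch feature map
    fg_mask_logits: (H, W) logits from the FG segmentation head
    """
    attn = sigmoid(fg_mask_logits)          # attention map in (0, 1)
    return bg_feat * (1.0 + attn[None])     # residual gating, broadcast over channels

def object_level_attention(bg_feat, rpn_objectness):
    """Gate BG features with RPN objectness scores (object-level cue)."""
    attn = sigmoid(rpn_objectness)          # (H, W) objectness map
    return bg_feat * (1.0 + attn[None])

# Toy shapes to show the data flow:
C, H, W = 8, 16, 16
bg = np.random.randn(C, H, W)
out = pixel_level_attention(bg, np.random.randn(H, W))
print(out.shape)  # same shape as the input BG features: (8, 16, 16)
```

The `1 + attn` residual form leaves BG features intact where the FG cue is weak and amplifies them where FG evidence is strong; the paper's actual attention modules are learned, but the gating pattern is the same in spirit.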

CVPR 2019
| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Panoptic Segmentation | Cityscapes val | AUNet (ResNet-101-FPN) | PQ | 59.0 | #29 |
| | | | PQst | 62.1 | #18 |
| | | | PQth | 54.8 | #15 |
| | | | mIoU | 75.6 | #27 |
| | | | AP | 34.4 | #23 |
| Panoptic Segmentation | COCO test-dev | AUNet (ResNeXt-152-FPN) | PQ | 46.5 | #24 |
| | | | PQst | 32.5 | #27 |
| | | | PQth | 55.8 | #14 |
| Panoptic Segmentation | COCO test-dev | AUNet (ResNet-101-FPN) | PQ | 45.2 | #27 |
| | | | PQst | 31.3 | #31 |
| | | | PQth | 54.4 | #19 |
| Panoptic Segmentation | COCO test-dev | AUNet (ResNet-152-FPN) | PQ | 45.5 | #26 |
| | | | PQst | 31.6 | #29 |
| | | | PQth | 54.7 | #18 |