Gated-SCNN: Gated Shape CNNs for Semantic Segmentation

Current state-of-the-art methods for image segmentation form a dense image representation where the color, shape and texture information are all processed together inside a deep CNN. This however may not be ideal as they contain very different type of information relevant for recognition. Here, we propose a new two-stream CNN architecture for semantic segmentation that explicitly wires shape information as a separate processing branch, i.e. shape stream, that processes information in parallel to the classical stream. Key to this architecture is a new type of gates that connect the intermediate layers of the two streams. Specifically, we use the higher-level activations in the classical stream to gate the lower-level activations in the shape stream, effectively removing noise and helping the shape stream to only focus on processing the relevant boundary-related information. This enables us to use a very shallow architecture for the shape stream that operates on the image-level resolution. Our experiments show that this leads to a highly effective architecture that produces sharper predictions around object boundaries and significantly boosts performance on thinner and smaller objects. Our method achieves state-of-the-art performance on the Cityscapes benchmark, in terms of both mask (mIoU) and boundary (F-score) quality, improving by 2% and 4% over strong baselines.

PDF Abstract ICCV 2019 PDF ICCV 2019 Abstract

Datasets


Task Dataset Model Metric Name Metric Value Global Rank Result Benchmark
Semantic Segmentation Cityscapes test Gated-SCNN Mean IoU (class) 82.8% # 24
Semantic Segmentation Cityscapes val GSCNN (ResNet-101) mIoU 74.7% # 63
Semantic Segmentation Cityscapes val GSCNN (ResNet-50) mIoU 73.0% # 68

Methods