Pyramid Pooling Module

Introduced by Zhao et al. in Pyramid Scene Parsing Network

A Pyramid Pooling Module (PPM) is a module for semantic segmentation that acts as an effective global contextual prior. The motivation is that, in a convolutional network such as a ResNet, the theoretical receptive field is already larger than the input image, yet the empirical receptive field is much smaller than the theoretical one, especially in high-level layers. As a result, many networks do not sufficiently incorporate the global scene prior.

The PPM is a global prior representation that addresses this problem. It contains information at different scales and from different sub-regions. With the 4-level pyramid used in the paper, the pooling kernels cover the whole, half of, and small portions of the image. The pooled outputs are fused as the global prior, which is then concatenated with the original feature map in the final part of the module.
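
The pool, fuse, and concatenate steps described above can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the bin sizes (1, 2, 3, 6) follow the configuration reported in the PSPNet paper, random weights stand in for the learned 1x1 convolutions, and nearest-neighbor upsampling replaces the bilinear interpolation used in the paper.

```python
import numpy as np


def adaptive_avg_pool(x, bins):
    """Average-pool a (C, H, W) array down to (C, bins, bins)."""
    c, h, w = x.shape
    out = np.empty((c, bins, bins), dtype=x.dtype)
    for i in range(bins):
        # Bin boundaries: floor for the start, ceiling for the end.
        r0, r1 = (i * h) // bins, -((-(i + 1) * h) // bins)
        for j in range(bins):
            c0, c1 = (j * w) // bins, -((-(j + 1) * w) // bins)
            out[:, i, j] = x[:, r0:r1, c0:c1].mean(axis=(1, 2))
    return out


def pyramid_pooling(x, bin_sizes=(1, 2, 3, 6), seed=0):
    """Concatenate x with one pooled-and-upsampled branch per pyramid level."""
    rng = np.random.default_rng(seed)
    c, h, w = x.shape
    reduced = c // len(bin_sizes)  # each branch keeps C / N channels
    branches = [x]
    for bins in bin_sizes:
        pooled = adaptive_avg_pool(x, bins)
        # Assumption: random weights stand in for a learned 1x1 convolution.
        weight = rng.standard_normal((reduced, c)) / np.sqrt(c)
        projected = np.einsum("oc,cij->oij", weight, pooled)
        # Assumption: nearest-neighbor upsampling instead of bilinear.
        rows = (np.arange(h) * bins) // h
        cols = (np.arange(w) * bins) // w
        branches.append(projected[:, rows][:, :, cols])
    return np.concatenate(branches, axis=0)


features = np.random.default_rng(1).standard_normal((8, 12, 12))
fused = pyramid_pooling(features)
print(fused.shape)
```

The first `C` output channels are the untouched input feature map, and each of the four branches contributes `C / 4` channels, so the output here has twice as many channels as the input. The `bins=1` branch is the global average of the whole map, while the larger bin counts summarize progressively smaller sub-regions.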

Source: Pyramid Scene Parsing Network

