Structured Knowledge Distillation for Semantic Segmentation

In this paper, we investigate the issue of knowledge distillation for training compact semantic segmentation networks by making use of cumbersome networks. We start from the straightforward scheme, pixel-wise distillation, which applies the distillation scheme originally introduced for image classification and performs knowledge distillation for each pixel separately. We further propose to distill the structured knowledge from cumbersome networks into compact networks, which is motivated by the fact that semantic segmentation is a structured prediction problem. We study two such structured distillation schemes: (i) pair-wise distillation that distills the pairwise similarities, and (ii) holistic distillation that uses adversarial training to distill holistic knowledge. The effectiveness of our knowledge distillation approaches is demonstrated by extensive experiments on three scene parsing datasets: Cityscapes, Camvid and ADE20K.

PDF Abstract

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods