Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

Recently, Neural Architecture Search (NAS) has successfully identified neural network architectures that exceed human-designed ones on large-scale image classification. In this paper, we study NAS for semantic image segmentation. Existing works often focus on searching the repeatable cell structure while hand-designing the outer network structure that controls the spatial resolution changes. This choice simplifies the search space, but becomes increasingly problematic for dense image prediction, which exhibits many more network-level architectural variations. Therefore, we propose to search the network-level structure in addition to the cell-level structure, which forms a hierarchical architecture search space. We present a network-level search space that includes many popular designs, and develop a formulation that allows efficient gradient-based architecture search (3 P100 GPU days on Cityscapes images). We demonstrate the effectiveness of the proposed method on the challenging Cityscapes, PASCAL VOC 2012, and ADE20K datasets. Auto-DeepLab, our architecture searched specifically for semantic image segmentation, attains state-of-the-art performance without any ImageNet pretraining.

Published at CVPR 2019.
Task                   Dataset               Model           Metric Name       Metric Value  Global Rank
Semantic Segmentation  ADE20K                Auto-DeepLab-L  Validation mIoU   43.98         # 197
Semantic Segmentation  ADE20K val            Auto-DeepLab-L  mIoU              43.98         # 82
                                                             Pixel Accuracy    81.72         # 8
Semantic Segmentation  Cityscapes test       Auto-DeepLab-L  Mean IoU (class)  82.1%         # 31
Semantic Segmentation  Cityscapes val        Auto-DeepLab-L  mIoU              80.33%        # 44
Semantic Segmentation  PASCAL VOC 2012 test  Auto-DeepLab-L  Mean IoU          85.6%         # 9
Semantic Segmentation  PASCAL VOC 2012 val   Auto-DeepLab-L  mIoU              82.04%        # 7
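
The results above are reported as mean intersection-over-union (mIoU) and pixel accuracy. As a reading aid, the following is a minimal sketch of how these two metrics are commonly computed from a confusion matrix. It is not the evaluation code used for any of the benchmarks listed here. The function names, the toy labels, and the ignore label of 255 are assumptions chosen for illustration, and official benchmark scripts may differ in detail, for example in how they treat classes that never occur.

```python
from typing import List, Sequence


def confusion_matrix(y_true: Sequence[int], y_pred: Sequence[int],
                     num_classes: int, ignore_label: int = 255) -> List[List[int]]:
    """Count pixels by (ground-truth class, predicted class).

    ignore_label=255 is an assumed convention for unlabeled pixels.
    """
    cm = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        if t == ignore_label:
            continue  # unlabeled pixels do not contribute to the metrics
        cm[t][p] += 1
    return cm


def mean_iou(cm: List[List[int]]) -> float:
    """Average of per-class IoU = TP / (TP + FP + FN).

    Classes with no ground-truth and no predicted pixels are skipped
    (an assumption of this sketch).
    """
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp                       # class c pixels predicted as another class
        fp = sum(cm[r][c] for r in range(n)) - tp  # other pixels predicted as class c
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious) if ious else 0.0


def pixel_accuracy(cm: List[List[int]]) -> float:
    """Fraction of labeled pixels whose predicted class is correct."""
    total = sum(sum(row) for row in cm)
    correct = sum(cm[c][c] for c in range(len(cm)))
    return correct / total if total else 0.0


# Toy example with flattened label maps (hypothetical values).
y_true = [0, 0, 1, 1, 2, 255]
y_pred = [0, 1, 1, 1, 2, 0]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
print(f"mIoU = {mean_iou(cm):.4f}, pixel accuracy = {pixel_accuracy(cm):.4f}")
```

In this toy case the per-class IoUs are 1/2, 2/3, and 1. Benchmark numbers are typically obtained by accumulating one confusion matrix over the whole evaluation set before taking the per-class ratios, rather than by averaging per-image scores.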

Methods