Encoder-Decoder with Atrous Separable Convolution for Semantic Image Segmentation

Spatial pyramid pooling module or encode-decoder structure are used in deep neural networks for semantic segmentation task. The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information... (read more)

PDF Abstract ECCV 2018 PDF ECCV 2018 Abstract
TASK DATASET MODEL METRIC NAME METRIC VALUE GLOBAL RANK USES EXTRA
TRAINING DATA
RESULT BENCHMARK
Lesion Segmentation Anatomical Tracings of Lesions After Stroke (ATLAS) DeepLab v3+ Dice 0.4609 # 5
IoU 0.3458 # 4
Precision 0.5831 # 5
Recall 0.4491 # 5
Semantic Segmentation Cityscapes val DeepLabv3+ (Dilated-Xception-71) mIoU 79.6% # 9
Image Classification ImageNet Modified Aligned Xception Top 1 Accuracy 79.81% # 74
Top 5 Accuracy 94.83% # 47
Semantic Segmentation PASCAL VOC 2012 test DeepLabv3+ (Xception-JFT) Mean IoU 89.0% # 2
Semantic Segmentation PASCAL VOC 2012 test DeepLabv3+ (Xception-65-JFT) Mean IoU 89.0% # 2
Semantic Segmentation PASCAL VOC 2012 val DeepLabv3+ (Xception-JFT) mIoU 84.56% # 3
Semantic Segmentation SkyScapes-Dense DeepLabv3+ Mean IoU 38.20 # 2

Results from Other Papers


TASK DATASET MODEL METRIC NAME METRIC VALUE GLOBAL RANK USES EXTRA
TRAINING DATA
SOURCE PAPER COMPARE
Semantic Segmentation Cityscapes test DeepLabv3+ (Xception-JFT) Mean IoU (class) 82.1% # 16

Methods used in the Paper