Domain Adaptive Semantic Segmentation with Self-Supervised Depth Estimation

Domain adaptation for semantic segmentation aims to improve model performance in the presence of a distribution shift between the source and target domains. Leveraging supervision from auxiliary tasks (such as depth estimation) has the potential to mitigate this shift because many visual tasks are closely related to each other. However, such supervision is not always available. In this work, we leverage the guidance from self-supervised depth estimation, which is available on both domains, to bridge the domain gap. On the one hand, we propose to explicitly learn the task feature correlation to strengthen the target semantic predictions with the help of target depth estimation. On the other hand, we use the discrepancy between the depth predictions of the source and target depth decoders to approximate the pixel-wise adaptation difficulty. The adaptation difficulty, inferred from depth, is then used to refine the target semantic segmentation pseudo-labels. The proposed method can be easily integrated into existing segmentation frameworks. We demonstrate the effectiveness of our approach on the benchmark tasks SYNTHIA-to-Cityscapes and GTA-to-Cityscapes, on which we achieve new state-of-the-art mIoU of 55.0% and 56.6%, respectively. Our code is available at https://qin.ee/corda.
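The pseudo-label refinement idea described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' formulation: the abstract gives no formula, so the exponential weighting, the confidence threshold, the ignore index of 255, and the function name `refine_pseudo_labels` are all assumptions made for illustration only.

```python
import numpy as np

IGNORE_INDEX = 255  # assumption: label value that the segmentation loss skips


def refine_pseudo_labels(sem_probs, depth_src_dec, depth_tgt_dec, threshold=0.5):
    """Illustrative depth-guided pseudo-label refinement (hypothetical helper).

    sem_probs:      (C, H, W) softmax probabilities for a target image.
    depth_src_dec:  (H, W) depth predicted by the source depth decoder.
    depth_tgt_dec:  (H, W) depth predicted by the target depth decoder.
    threshold:      assumed cutoff on the depth-weighted confidence.
    """
    # Pixel-wise disagreement between the two depth decoders, used here
    # as a stand-in for "adaptation difficulty".
    discrepancy = np.abs(depth_src_dec - depth_tgt_dec)

    # Assumed mapping: larger disagreement gives a smaller weight in (0, 1].
    weight = np.exp(-discrepancy)

    # Standard pseudo-labels: most likely class and its probability.
    labels = sem_probs.argmax(axis=0)
    confidence = sem_probs.max(axis=0) * weight

    # Pixels judged too hard to adapt are excluded from self-training.
    refined = np.where(confidence >= threshold, labels, IGNORE_INDEX)
    return refined, weight
```

In this sketch, a pixel keeps its pseudo-label only if the semantic confidence, down-weighted by the depth disagreement, stays above the threshold. Other choices (for example, using the weight to scale the per-pixel loss instead of masking) would fit the abstract's description equally well.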

ICCV 2021

Results from the Paper


Ranked #14 on Domain Adaptation on SYNTHIA-to-Cityscapes (using extra training data)

Task                           Dataset                    Model                Metric Name        Metric Value  Global Rank
Synthetic-to-Real Translation  GTAV-to-Cityscapes Labels  CorDA                mIoU               56.6          #25
Domain Adaptation              SYNTHIA-to-Cityscapes      CorDA (ResNet-101)   mIoU               55.0          #14
Synthetic-to-Real Translation  SYNTHIA-to-Cityscapes      CorDA (ResNet-101)   mIoU (13 classes)  62.8          #14
Synthetic-to-Real Translation  SYNTHIA-to-Cityscapes      CorDA (ResNet-101)   mIoU (16 classes)  55.0          #16
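
The results above report mIoU, including averages over 13 and 16 classes. As a rough sketch of how such a figure is typically computed, the following NumPy code builds a confusion matrix and averages per-class intersection-over-union, optionally over a class subset. The function names, the ignore index of 255, and the subset indices in the usage note are assumptions for illustration, not taken from the benchmark's evaluation code.

```python
import numpy as np


def confusion_matrix(pred, target, num_classes, ignore_index=255):
    """Rows index ground-truth classes, columns index predicted classes."""
    mask = target != ignore_index  # skip unlabeled pixels
    idx = num_classes * target[mask].astype(np.int64) + pred[mask].astype(np.int64)
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)


def mean_iou(conf, class_subset=None):
    """Mean IoU over all classes, or over class_subset if given."""
    tp = np.diag(conf).astype(np.float64)
    # Union = ground-truth pixels + predicted pixels - intersection.
    union = conf.sum(axis=0) + conf.sum(axis=1) - tp
    with np.errstate(divide="ignore", invalid="ignore"):
        iou = tp / union  # NaN for classes absent from both pred and target
    if class_subset is not None:
        iou = iou[list(class_subset)]
    return float(np.nanmean(iou))
```

Calling `mean_iou(conf)` averages over every class, while `mean_iou(conf, class_subset=[0, 2])` (hypothetical indices) averages over a reduced set, which is how a single model can have two different mIoU values on the same dataset.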