Triply Supervised Decoder Networks for Joint Detection and Segmentation

CVPR 2019  ·  Jiale Cao, Yanwei Pang, Xuelong Li

Joint object detection and semantic segmentation is useful in many fields, such as self-driving cars and unmanned surface vessels. Initial but important progress towards this goal has been made by simply sharing deep convolutional features between the two tasks. However, this simple scheme does not fully exploit the fact that detection and segmentation are mutually beneficial. To overcome this drawback, we propose a framework called TripleNet, in which three types of supervision (detection-oriented supervision, class-aware segmentation supervision, and class-agnostic segmentation supervision) are imposed on each layer of the decoder network. Class-agnostic segmentation supervision provides an objectness prior for both semantic segmentation and object detection. Besides the three types of supervision, two lightweight modules (i.e., the inner-connected module and attention skip-layer fusion) are incorporated into each layer of the decoder. In the proposed framework, detection and segmentation can sufficiently boost each other. Moreover, class-agnostic and class-aware segmentation on each decoder layer are not performed at the test stage, so no extra computational cost is introduced at test time. Experimental results on the VOC2007 and VOC2012 datasets demonstrate that TripleNet improves both detection and segmentation accuracy without adding extra computational cost.
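
The two ideas in the abstract that lend themselves to a small illustration are the class-agnostic (objectness) target and the per-layer triple supervision. The sketch below is a minimal illustration under stated assumptions, not the authors' implementation: it assumes label 0 marks background, that the class-agnostic target is obtained by collapsing all object classes into one foreground label, and that the three loss terms are combined as a weighted sum over decoder layers. The function names, dictionary keys, and weights are hypothetical and do not come from the paper.

```python
from typing import Dict, List, Sequence

BACKGROUND = 0  # assumption: label 0 marks background pixels


def class_agnostic_target(label_map: Sequence[Sequence[int]]) -> List[List[int]]:
    """Collapse a class-aware label map into a binary object/background mask."""
    return [[0 if label == BACKGROUND else 1 for label in row] for row in label_map]


def triple_supervision_loss(
    layer_losses: Sequence[Dict[str, float]],
    w_det: float = 1.0,       # hypothetical weight for the detection term
    w_aware: float = 1.0,     # hypothetical weight for class-aware segmentation
    w_agnostic: float = 1.0,  # hypothetical weight for class-agnostic segmentation
) -> float:
    """Sum the three per-layer loss terms over all decoder layers.

    Each entry of layer_losses holds the already computed scalar losses of one
    decoder layer. This combination is only needed during training.
    """
    total = 0.0
    for losses in layer_losses:
        total += (
            w_det * losses["detection"]
            + w_aware * losses["class_aware_seg"]
            + w_agnostic * losses["class_agnostic_seg"]
        )
    return total
```

For example, class_agnostic_target([[0, 0, 15], [7, 7, 15]]) marks every non-background pixel as foreground regardless of its class. Because the segmentation terms only appear in the training loss, dropping the corresponding heads at inference is consistent with the abstract's claim that no extra cost is added at the test stage.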


Datasets


Task                   Dataset               Model      Metric Name  Metric Value  Global Rank
Semantic Segmentation  PASCAL VOC 2012 test  TripleNet  Mean IoU     83.3%         #26
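
The Mean IoU figure reported above is the per-class intersection-over-union averaged over classes. The sketch below shows the standard definition on flattened label lists. It is illustrative only: the function name is hypothetical, and it does not reproduce details of the official PASCAL VOC evaluation, such as the handling of ignored boundary pixels.

```python
from typing import List, Sequence


def mean_iou(pred: Sequence[int], target: Sequence[int], num_classes: int) -> float:
    """Mean intersection-over-union over classes present in pred or target."""
    # confusion[t][p] counts pixels with ground truth t that were predicted as p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        confusion[t][p] += 1

    ious: List[float] = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[t][c] for t in range(num_classes)) - tp
        union = tp + fp + fn
        if union > 0:  # skip classes absent from both prediction and ground truth
            ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0
```

With pred = [0, 0, 1, 1] and target = [0, 1, 1, 1], class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.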
