Separable Structure Modeling for Semi-supervised Video Object Segmentation

18 Feb 2021  ·  Wencheng Zhu, Jiahao Li, Jiwen Lu, Jie Zhou

In this paper, we propose a separable structure modeling approach for semi-supervised video object segmentation. Unlike most existing methods, which overlook the semantic structural information of target objects, our method not only captures pixel-level similarity relationships between the reference and target frames but also reveals the separable structure of the specified objects in target frames. Specifically, we first compute a pixel-wise similarity matrix from the representations of reference and target pixels and then select the top-ranked reference pixels for target pixel classification. Using the prior knowledge from these top-ranked reference pixels, we further select representative target pixels for object structure modeling. In the structure modeling branch, we extract shared features that represent the whole object and individual features that represent its components. Moreover, the proposed method is fast: it requires neither online fine-tuning nor post-processing. We conduct extensive experiments and ablation studies on three widely used datasets, DAVIS-16, DAVIS-17, and YouTube-VOS, and the results demonstrate that our method achieves superior performance compared with state-of-the-art semi-supervised video object segmentation approaches in terms of both speed and accuracy.
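The first two steps of the abstract (a pixel-wise similarity matrix, then top-ranked reference pixels used to classify each target pixel) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the cosine similarity, the similarity-weighted vote, the function name `topk_reference_classification`, and the parameters `k` and `num_classes` are all assumptions chosen for illustration, and features are taken to be already flattened to one row per pixel.

```python
import numpy as np


def topk_reference_classification(ref_feats, ref_labels, tgt_feats, k=2, num_classes=2):
    """Classify target pixels from their k most similar reference pixels.

    Assumptions (not taken from the paper): cosine similarity and a
    similarity-weighted vote over the labels of the selected pixels.
    ref_feats: (N_r, C), ref_labels: (N_r,) ints, tgt_feats: (N_t, C).
    """
    # L2-normalise rows so that a dot product equals cosine similarity.
    ref = ref_feats / (np.linalg.norm(ref_feats, axis=1, keepdims=True) + 1e-8)
    tgt = tgt_feats / (np.linalg.norm(tgt_feats, axis=1, keepdims=True) + 1e-8)

    # Pixel-wise similarity matrix: rows are target pixels, columns are reference pixels.
    sim = tgt @ ref.T  # shape (N_t, N_r)

    # Indices of the k most similar reference pixels per target pixel (order within k is arbitrary).
    topk_idx = np.argpartition(-sim, k - 1, axis=1)[:, :k]
    topk_sim = np.take_along_axis(sim, topk_idx, axis=1)
    topk_lab = ref_labels[topk_idx]  # shape (N_t, k)

    # Similarity-weighted vote; negative similarities are clipped to zero weight.
    weights = np.maximum(topk_sim, 0.0)
    votes = np.zeros((tgt.shape[0], num_classes))
    for c in range(num_classes):
        votes[:, c] = np.where(topk_lab == c, weights, 0.0).sum(axis=1)
    return sim, topk_idx, votes.argmax(axis=1)


# Toy data: two foreground-like and two background-like reference pixels.
ref_feats = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
ref_labels = np.array([1, 1, 0, 0])
tgt_feats = np.array([[1.0, 0.05], [0.05, 1.0]])

sim, topk_idx, pred = topk_reference_classification(ref_feats, ref_labels, tgt_feats, k=2)
print(pred)  # [1 0]
```

In this toy setup the first target pixel is closest to the two reference pixels labelled 1 and the second to those labelled 0, so the predicted labels follow the selected reference pixels. The structure modeling branch described in the abstract is not covered by this sketch.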

Task: Semi-Supervised Video Object Segmentation  ·  Model: SSM-VOS

Dataset                 Metric Name           Metric Value   Global Rank
DAVIS 2016              Jaccard (Mean)        86.2           # 46
DAVIS 2016              Jaccard (Recall)      97.1           # 6
DAVIS 2016              Jaccard (Decay)       5.3            # 25
DAVIS 2016              F-measure (Mean)      85.6           # 49
DAVIS 2016              F-measure (Recall)    92.3           # 13
DAVIS 2016              F-measure (Decay)     5.6            # 24
DAVIS 2016              J&F                   85.9           # 49
DAVIS 2016              Speed (FPS)           36.5           # 11
DAVIS 2017 (test-dev)   J&F                   62.0           # 45
DAVIS 2017 (test-dev)   Jaccard (Mean)        60.2           # 45
DAVIS 2017 (test-dev)   Jaccard (Decay)       23.5           # 17
DAVIS 2017 (test-dev)   F-measure (Mean)      63.8           # 45
DAVIS 2017 (test-dev)   F-measure (Decay)     25.3           # 17
DAVIS 2017 (val)        Jaccard (Mean)        75.3           # 44
DAVIS 2017 (val)        Jaccard (Decay)       11.7           # 4
DAVIS 2017 (val)        F-measure (Mean)      79.9           # 48
DAVIS 2017 (val)        F-measure (Decay)     15.3           # 3
DAVIS 2017 (val)        J&F                   77.6           # 47
DAVIS 2017 (val)        Speed (FPS)           22.3           # 19
YouTube-VOS 2018        F-Measure (Seen)      73.3           # 45
YouTube-VOS 2018        F-Measure (Unseen)    62.6           # 46
YouTube-VOS 2018        Overall               66.5           # 47
YouTube-VOS 2018        Speed (FPS)           24.1           # 9
YouTube-VOS 2018        Jaccard (Seen)        72.3           # 42
YouTube-VOS 2018        Jaccard (Unseen)      57.8           # 44
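
As a quick consistency check on the figures listed above, the J&F score on DAVIS is the mean of Jaccard (Mean) and F-measure (Mean), and the YouTube-VOS Overall score is the mean of the four seen/unseen Jaccard and F-measure values. The short sketch below recomputes them from the listed numbers; the helper name `mean_score` is hypothetical and is used only for illustration.

```python
def mean_score(*values):
    """Average the given metric values and round to one decimal place."""
    return round(sum(values) / len(values), 1)


# DAVIS: J&F is the mean of Jaccard (Mean) and F-measure (Mean).
davis16_jf = mean_score(86.2, 85.6)          # listed J&F: 85.9
davis17_testdev_jf = mean_score(60.2, 63.8)  # listed J&F: 62.0
davis17_val_jf = mean_score(75.3, 79.9)      # listed J&F: 77.6

# YouTube-VOS: Overall is the mean of J (Seen), J (Unseen), F (Seen), F (Unseen).
ytvos_overall = mean_score(72.3, 57.8, 73.3, 62.6)  # listed Overall: 66.5

print(davis16_jf, davis17_testdev_jf, davis17_val_jf, ytvos_overall)
```

All four recomputed values match the listed aggregate scores.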
