Unsupervised Video Object Segmentation with Motion-based Bilateral Networks
In this work, we study the unsupervised video object segmentation problem, where moving objects are segmented without prior knowledge of these objects. First, we propose a motion-based bilateral network to estimate the background based on the motion pattern of non-object regions. The bilateral network reduces false positive regions by accurately identifying background objects. Then, we integrate the background estimate from the bilateral network with instance embeddings into a graph, which allows multi-frame reasoning with graph edges linking pixels from different frames. We classify graph nodes by defining and minimizing a cost function, and segment the video frames based on the node labels. The proposed method outperforms previous state-of-the-art unsupervised video object segmentation methods on the DAVIS 2016 and FBMS-59 datasets.
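The abstract does not spell out the cost function, so the following is only a generic sketch of what labeling graph nodes by minimizing a cost can look like, not the paper's formulation. It assumes a quadratic cost with a data term (agreement with a per-node prior such as a background estimate) and a smoothness term over weighted edges, which may connect pixels in different frames. The function name `label_graph_nodes`, the `confidence` and `smoothness` parameters, and the 0.5 threshold are all hypothetical.

```python
import numpy as np

def label_graph_nodes(num_nodes, edges, prior, confidence, smoothness=1.0):
    """Minimize  sum_i c_i (f_i - y_i)^2 + smoothness * sum_(i,j) w_ij (f_i - f_j)^2.

    edges:      iterable of (i, j, w) undirected weighted edges (hypothetical weights,
                e.g. derived from embedding similarity)
    prior:      y_i, per-node prior score in [0, 1] (1 = object, 0 = background)
    confidence: c_i >= 0, how much each prior is trusted (0 = no prior for that node)
    """
    # Build the graph Laplacian L so that f^T L f equals the pairwise smoothness sum.
    laplacian = np.zeros((num_nodes, num_nodes))
    for i, j, w in edges:
        laplacian[i, i] += w
        laplacian[j, j] += w
        laplacian[i, j] -= w
        laplacian[j, i] -= w

    c = np.diag(np.asarray(confidence, dtype=np.float64))
    y = np.asarray(prior, dtype=np.float64)

    # Setting the gradient of the quadratic cost to zero gives (C + smoothness * L) f = C y.
    # The system is solvable when the graph is connected and at least one c_i > 0.
    scores = np.linalg.solve(c + smoothness * laplacian, c @ y)
    return scores, scores > 0.5  # assumed decision threshold
```

For a four-node chain whose endpoints carry confident priors (object at one end, background at the other), the unlabeled middle nodes receive interpolated scores, which is the kind of propagation that edges across frames would enable.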
Results from the Paper
Ranked #3 on Video Salient Object Detection on MCL (using extra training data)
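Task | Dataset | Model | Metric Name | Metric Value | Global Rank
---|---|---|---|---|---
Video Salient Object Detection | DAVIS-2016 | MBNM | S-Measure | 0.887 | #3
Video Salient Object Detection | DAVIS-2016 | MBNM | max E-measure | 0.966 | #1
Video Salient Object Detection | DAVIS-2016 | MBNM | max F-measure | 0.862 | #2
Video Salient Object Detection | DAVIS-2016 | MBNM | Average MAE | 0.031 | #6
Video Salient Object Detection | DAVSOD-easy35 | MBNM | S-Measure | 0.646 | #4
Video Salient Object Detection | DAVSOD-easy35 | MBNM | max F-measure | 0.506 | #4
Video Salient Object Detection | DAVSOD-easy35 | MBNM | max E-measure | 0.694 | #4
Video Salient Object Detection | DAVSOD-easy35 | MBNM | Average MAE | 0.109 | #3
Video Salient Object Detection | DAVSOD-Normal25 | MBNM | S-Measure | 0.597 | #4
Video Salient Object Detection | DAVSOD-Normal25 | MBNM | max E-measure | 0.665 | #4
Video Salient Object Detection | DAVSOD-Normal25 | MBNM | Average MAE | 0.127 | #3
Video Salient Object Detection | FBMS-59 | MBNM | S-Measure | 0.857 | #4
Video Salient Object Detection | FBMS-59 | MBNM | Average MAE | 0.047 | #3
Video Salient Object Detection | FBMS-59 | MBNM | max E-measure | 0.892 | #2
Video Salient Object Detection | FBMS-59 | MBNM | max F-measure | 0.816 | #5
Video Salient Object Detection | MCL | MBNM | S-Measure | 0.755 | #3
Video Salient Object Detection | MCL | MBNM | max E-measure | 0.858 | #3
Video Salient Object Detection | MCL | MBNM | max F-measure | 0.698 | #2
Video Salient Object Detection | MCL | MBNM | Average MAE | 0.119 | #3
Video Salient Object Detection | SegTrack v2 | MBNM | S-Measure | 0.809 | #4
Video Salient Object Detection | SegTrack v2 | MBNM | max F-measure | 0.716 | #3
Video Salient Object Detection | SegTrack v2 | MBNM | Average MAE | 0.026 | #4
Video Salient Object Detection | SegTrack v2 | MBNM | max E-measure | 0.878 | #3
Video Salient Object Detection | UVSD | MBNM | S-Measure | 0.698 | #4
Video Salient Object Detection | UVSD | MBNM | max E-measure | 0.776 | #4
Video Salient Object Detection | UVSD | MBNM | Average MAE | 0.079 | #4
Video Salient Object Detection | ViSal | MBNM | S-Measure | 0.857 | #5
Video Salient Object Detection | ViSal | MBNM | max E-measure | 0.892 | #4
Video Salient Object Detection | ViSal | MBNM | Average MAE | 0.047 | #5
Video Salient Object Detection | VOS-T | MBNM | S-Measure | 0.742 | #4
Video Salient Object Detection | VOS-T | MBNM | max E-measure | 0.797 | #4
Video Salient Object Detection | VOS-T | MBNM | Average MAE | 0.099 | #5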
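
For readers unfamiliar with two of the metrics in the table, here is a minimal per-frame sketch of how MAE and max F-measure are commonly computed in salient object detection. It assumes a predicted saliency map with values in [0, 1], a binary ground-truth mask, and the conventional weighting β² = 0.3 for the F-measure. The function names and the 256-threshold sweep are assumptions, and the exact evaluation protocol behind the numbers above (for example, how scores are averaged over frames and videos) may differ.

```python
import numpy as np

def mean_absolute_error(pred, gt):
    """MAE: mean per-pixel absolute difference between saliency map and ground truth."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)
    return float(np.mean(np.abs(pred - gt)))

def max_f_measure(pred, gt, beta_sq=0.3, num_thresholds=256):
    """Maximum F-measure over a sweep of binarization thresholds in [0, 1]."""
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt) > 0.5  # treat ground truth as a boolean mask
    best = 0.0
    for t in np.linspace(0.0, 1.0, num_thresholds):
        mask = pred >= t
        tp = np.logical_and(mask, gt).sum()
        if tp == 0:
            continue  # precision and recall are both zero, so F is zero
        precision = tp / mask.sum()
        recall = tp / gt.sum()
        f = (1 + beta_sq) * precision * recall / (beta_sq * precision + recall)
        best = max(best, f)
    return float(best)
```

Lower is better for MAE, while higher is better for the F-measure; with β² = 0.3 the F-measure weights precision more heavily than recall.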