The semi-supervised scenario assumes the user provides a full mask of the object of interest in the first frame of a video sequence. Methods must then produce segmentation masks for that object in all subsequent frames.
In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real-time, with a single simple approach.
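The semi-supervised protocol described above can be sketched as an evaluation loop: the model receives only the first-frame mask and is scored on its predictions for the remaining frames. The sketch below is illustrative only; the `segment_fn` interface is a hypothetical stand-in for any method, and the Jaccard index shown is the standard region-similarity (J) measure used by benchmarks such as DAVIS.

```python
import numpy as np

def region_similarity(pred, gt):
    """Jaccard index (J measure): intersection over union of two binary masks."""
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    return 1.0 if union == 0 else inter / union

def evaluate_sequence(frames, first_mask, gt_masks, segment_fn):
    """Hypothetical driver for the semi-supervised protocol.

    frames:     list of video frames (frames[0] is the annotated frame)
    first_mask: ground-truth binary mask for frames[0]
    gt_masks:   per-frame ground-truth masks used only for scoring
    segment_fn: assumed interface segment_fn(frame, first_frame, first_mask)
                returning a binary mask for `frame`
    """
    first_frame = frames[0]
    scores = []
    # The model never sees gt_masks[1:]; they are used only to compute J.
    for frame, gt in zip(frames[1:], gt_masks[1:]):
        pred = segment_fn(frame, first_frame, first_mask)
        scores.append(region_similarity(pred, gt))
    return float(np.mean(scores))
```

A trivial baseline under this interface is identity propagation (always returning the first-frame mask), which scores well only on static objects.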
This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.
Multi-object video object segmentation is a challenging task, especially in the zero-shot case, where no object mask is given at the initial frame and the model must find the objects to be segmented throughout the sequence.
We validate our method on four benchmark sets that cover single and multiple object segmentation.
Specifically, to integrate the insights of matching based and propagation based methods, we employ an encoder-decoder framework to learn pixel-level similarity and segmentation in an end-to-end manner.
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations.
Extensive experiments on challenging datasets demonstrate the effectiveness of the proposed method, especially in cases where the object disappears from view.
Semi-supervised video object segmentation has made significant progress on real and challenging videos in recent years.
In this work we propose a capsule-based approach for semi-supervised video object segmentation.