The semi-supervised scenario assumes the user inputs a full mask of the object of interest in the first frame of a video sequence. Methods have to produce the segmentation mask for that object in the subsequent frames.
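The protocol described above can be sketched in a few lines. This is only an illustration of the task interface, not any paper's method: the names `segment_sequence`, `propagate`, and `copy_previous_mask`, and the list-of-lists frame and mask types, are all assumptions made for the example. The trivial baseline simply carries the previous mask forward unchanged.

```python
from typing import Callable, List

Frame = List[List[int]]   # assumed: a grayscale frame as rows of pixel values
Mask = List[List[bool]]   # True where a pixel belongs to the object of interest

# Hypothetical signature: (previous frame, previous mask, current frame) -> current mask
Propagate = Callable[[Frame, Mask, Frame], Mask]


def segment_sequence(frames: List[Frame], first_mask: Mask,
                     propagate: Propagate) -> List[Mask]:
    """Semi-supervised setting: the mask for frame 0 is given by the user,
    and a mask must be produced for every subsequent frame."""
    if not frames:
        return []
    masks = [first_mask]
    # Walk over consecutive frame pairs and predict each new mask in turn.
    for prev_frame, frame in zip(frames, frames[1:]):
        masks.append(propagate(prev_frame, masks[-1], frame))
    return masks


def copy_previous_mask(prev_frame: Frame, prev_mask: Mask, frame: Frame) -> Mask:
    """Placeholder baseline: assume the object does not move between frames."""
    return [row[:] for row in prev_mask]


if __name__ == "__main__":
    frames = [[[0, 0], [0, 0]] for _ in range(3)]   # three dummy 2x2 frames
    first_mask = [[True, False], [False, False]]    # user-provided annotation
    masks = segment_sequence(frames, first_mask, copy_previous_mask)
    print(len(masks))
```

A real method would replace `copy_previous_mask` with a learned model; the surrounding loop only fixes what is given (the first-frame mask) and what must be produced (one mask per remaining frame).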
In this paper we illustrate how to perform both visual object tracking and semi-supervised video object segmentation, in real time, with a single simple approach.
#4 best model for Visual Object Tracking on VOT2017/18
Multi-object video object segmentation is a challenging task, especially in the zero-shot case, where no object mask is given for the initial frame and the model has to find the objects to be segmented along the sequence.
We validate our method on four benchmark sets that cover single and multiple object segmentation.
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground-truth annotations.
Extensive experiments on challenging datasets demonstrate the effectiveness of the proposed method, especially in cases where the object goes missing.
Semi-supervised video object segmentation has made significant progress on real and challenging videos in recent years.