This allows us to achieve a rich internal representation of the target in the current frame, significantly increasing the segmentation accuracy of our approach.
In our framework, the past frames with object masks form an external memory, and the current frame as the query is segmented using the mask information in the memory.
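The memory lookup described above can be pictured as a soft attention read. The following is a minimal sketch, not the paper's implementation: it assumes each memory location stores a key vector and a value vector (for example, a foreground probability), and that every query location retrieves a softmax-weighted mix of memory values based on dot-product similarity. The function name `memory_read` and the toy data are hypothetical.

```python
import math

def memory_read(query_keys, memory_keys, memory_values):
    """For each query key, return a softmax-weighted sum of memory values.

    Weights come from dot-product similarity between the query key and
    every memory key (an assumed similarity measure for this sketch).
    """
    dim = len(memory_values[0])
    readout = []
    for q in query_keys:
        # Similarity of this query location to every memory location.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) for k in memory_keys]
        # Numerically stable softmax over memory locations.
        peak = max(scores)
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        weights = [w / total for w in weights]
        # Weighted sum of the stored values.
        readout.append([
            sum(w * v[d] for w, v in zip(weights, memory_values))
            for d in range(dim)
        ])
    return readout

# Toy memory: two locations, one foreground (value 1.0), one background (0.0).
memory_keys = [[1.0, 0.0], [0.0, 1.0]]
memory_values = [[1.0], [0.0]]
# Two query locations, each resembling one of the memory keys.
query_keys = [[5.0, 0.0], [0.0, 5.0]]
readout = memory_read(query_keys, memory_keys, memory_values)
```

In this toy setup, the first query location reads a value close to foreground and the second a value close to background, because each matches a different memory key.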
This paper investigates the principles of embedding learning to tackle the challenging task of semi-supervised video object segmentation.
Multi-object video object segmentation is a challenging task, especially in the zero-shot case, where no object mask is given for the initial frame and the model has to find the objects to be segmented along the sequence.
Ranked #1 on Youtube-VOS on YouTube-VOS
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground-truth annotations.
Video object segmentation (VOS) aims at pixel-level object tracking given only the annotations in the first frame.
Ranked #1 on Visual Object Tracking on YouTube-VOS (Jaccard (Seen) metric)
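The Jaccard metric named in the ranking above is the intersection over union of a predicted mask and a ground-truth mask. A minimal sketch for binary masks stored as nested lists follows; the function name `jaccard` is hypothetical, and returning 1.0 when both masks are empty is an assumed convention rather than something stated in the source.

```python
def jaccard(pred, gt):
    """Intersection over union of two binary masks of the same shape."""
    inter = 0
    union = 0
    for row_p, row_g in zip(pred, gt):
        for p, g in zip(row_p, row_g):
            inter += 1 if (p and g) else 0
            union += 1 if (p or g) else 0
    # Assumed convention: two empty masks count as a perfect match.
    return 1.0 if union == 0 else inter / union

pred = [[1, 1, 0],
        [0, 1, 0]]
gt = [[1, 0, 0],
      [0, 1, 1]]
score = jaccard(pred, gt)  # 2 shared pixels out of 4 in the union -> 0.5
```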
The target appearance model consists of a light-weight module, which is learned during the inference stage using fast optimization techniques to predict a coarse but robust target segmentation.
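To make the idea of a light-weight model learned at inference time concrete, here is a minimal sketch under strong assumptions. It fits a linear scorer on per-pixel feature vectors from the annotated first frame using plain gradient descent on a squared error, which stands in for the fast optimization techniques the abstract mentions and is not the authors' method. The names `fit_target_model` and `predict`, the hyperparameters, and the toy features are all hypothetical.

```python
def fit_target_model(features, labels, steps=200, lr=0.1, weight_decay=1e-3):
    """Fit weights w and bias b so that w.x + b approximates the mask label."""
    dim = len(features[0])
    n = len(features)
    w = [0.0] * dim
    b = 0.0
    for _ in range(steps):
        # Gradient of the regularizer (applied to w only).
        grad_w = [weight_decay * wi for wi in w]
        grad_b = 0.0
        # Gradient of the mean squared error over all labeled pixels.
        for x, y in zip(features, labels):
            err = sum(wi * xi for wi, xi in zip(w, x)) + b - y
            for d in range(dim):
                grad_w[d] += err * x[d] / n
            grad_b += err / n
        w = [wi - lr * g for wi, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b

def predict(w, b, x):
    """Coarse target score for one feature vector (higher means target)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b

# Toy first-frame data: two target pixels (label 1) and two background pixels (label 0).
features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels = [1.0, 1.0, 0.0, 0.0]
w, b = fit_target_model(features, labels)
target_score = predict(w, b, [1.0, 0.0])
background_score = predict(w, b, [0.0, 1.0])
```

Thresholding such scores at 0.5 gives a coarse mask; in the approach described above, a prediction of this kind would then be refined rather than used directly.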