Semi-Supervised Video Object Segmentation
94 papers with code • 15 benchmarks • 13 datasets
The semi-supervised scenario assumes that the user provides a full mask of the object(s) of interest in the first frame of a video sequence. Methods must then produce segmentation masks for those objects in the subsequent frames.
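The protocol described above can be sketched as a small loop: the mask for the first frame is given, and every later mask is predicted from what came before. This is only an illustrative sketch, not any particular paper's method. The names `segment_video`, `propagate`, and `copy_previous_mask` are hypothetical, frames and masks are represented as nested lists purely for simplicity, and the "copy the previous mask" step is a trivial placeholder standing in for a real model.

```python
from typing import Callable, List, Sequence

# Hypothetical representations, chosen only to keep the sketch dependency-free.
Frame = List[List[int]]  # image as rows of pixel values
Mask = List[List[int]]   # 0 = background, k > 0 = object id k

# A propagation step maps (previous frame, previous mask, current frame) to a mask.
Propagate = Callable[[Frame, Mask, Frame], Mask]


def segment_video(frames: Sequence[Frame], first_mask: Mask,
                  propagate: Propagate) -> List[Mask]:
    """Return one mask per frame; the mask for frame 0 is the user-provided one."""
    if not frames:
        return []
    masks = [first_mask]
    for t in range(1, len(frames)):
        # Each subsequent mask is predicted from the previous frame and mask.
        masks.append(propagate(frames[t - 1], masks[-1], frames[t]))
    return masks


def copy_previous_mask(prev_frame: Frame, prev_mask: Mask,
                       cur_frame: Frame) -> Mask:
    """Placeholder step: assume the object did not move between frames."""
    return [row[:] for row in prev_mask]


if __name__ == "__main__":
    video = [[[0, 0], [0, 0]] for _ in range(3)]  # three blank 2x2 frames
    annotated = [[0, 1], [0, 0]]                  # object 1 in the top-right pixel
    for t, mask in enumerate(segment_video(video, annotated, copy_previous_mask)):
        print(t, mask)
```

A real method would replace `copy_previous_mask` with a learned model; the surrounding loop only fixes the input/output contract of the task.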
Latest papers
CLVOS23: A Long Video Object Segmentation Dataset for Continual Learning
Continual learning in real-world scenarios is a major challenge.
Learning to Learn Better for Video Object Segmentation
The recently proposed joint learning framework (JOINT) integrates matching-based transductive reasoning and online inductive learning to achieve accurate and robust semi-supervised video object segmentation (SVOS).
Decoupling Features in Hierarchical Propagation for Video Object Segmentation
To solve such a problem and further facilitate the learning of visual embeddings, this paper proposes a Decoupling Features in Hierarchical Propagation (DeAOT) approach.
Global Spectral Filter Memory Network for Video Object Segmentation
Besides, we empirically find that low-frequency features should be enhanced in the encoder (backbone), while high-frequency features should be enhanced in the decoder (segmentation head).
SWEM: Towards Real-Time Video Object Segmentation with Sequential Weighted Expectation-Maximization
Matching-based methods, especially those based on space-time memory, are significantly ahead of other solutions in semi-supervised video object segmentation (VOS).
Per-Clip Video Object Segmentation
In this per-clip inference scheme, we update the memory at intervals and simultaneously process the set of consecutive frames (i.e., a clip) between memory updates.
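The per-clip scheduling described above can be illustrated with a short sketch. It covers only the control flow (process a clip, then update the memory) and is not the paper's implementation. The names `per_clip_schedule`, `run_per_clip`, `segment_clip`, and `clip_length` are hypothetical, and the choice to store the last frame of each clip in memory is an assumption made for illustration.

```python
from typing import Any, Callable, List, Sequence, Tuple

Memory = List[Tuple[Any, Any]]  # hypothetical: list of (frame, mask) pairs


def per_clip_schedule(num_frames: int, clip_length: int) -> List[List[int]]:
    """Group frames 1..num_frames-1 into consecutive clips (frame 0 is annotated)."""
    if clip_length < 1:
        raise ValueError("clip_length must be at least 1")
    return [list(range(start, min(start + clip_length, num_frames)))
            for start in range(1, num_frames, clip_length)]


def run_per_clip(frames: Sequence[Any], first_mask: Any,
                 segment_clip: Callable[[Memory, List[Any]], List[Any]],
                 clip_length: int) -> List[Any]:
    """Segment a video clip by clip, updating the memory once per clip."""
    memory: Memory = [(frames[0], first_mask)]
    masks = [first_mask]
    for clip in per_clip_schedule(len(frames), clip_length):
        clip_frames = [frames[t] for t in clip]
        # All frames of the clip are processed together against the same memory.
        clip_masks = segment_clip(memory, clip_frames)
        masks.extend(clip_masks)
        # Assumption: the memory update stores the last frame of the clip.
        memory.append((clip_frames[-1], clip_masks[-1]))
    return masks


if __name__ == "__main__":
    print(per_clip_schedule(8, 3))
```

With `clip_length=1` this reduces to updating the memory after every frame; larger values trade fewer memory updates for joint processing of more frames at once.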
Learning Quality-aware Dynamic Memory for Video Object Segmentation
However, they mainly focus on better matching between the current frame and the memory frames without explicitly paying attention to the quality of the memory.
XMem: Long-Term Video Object Segmentation with an Atkinson-Shiffrin Memory Model
We present XMem, a video object segmentation architecture for long videos with unified feature memory stores inspired by the Atkinson-Shiffrin memory model.
Tackling Background Distraction in Video Object Segmentation
Semi-supervised video object segmentation (VOS) aims to densely track certain designated objects in videos.
Towards Robust Video Object Segmentation with Adaptive Object Calibration
We consolidate this conditional mask calibration process in a progressive manner, where the object representations and proto-masks iteratively evolve to become more discriminative.