Semi-Supervised Video Object Segmentation
94 papers with code • 15 benchmarks • 13 datasets
The semi-supervised scenario assumes the user provides a full mask of the object(s) of interest in the first frame of a video sequence. Methods must then produce the segmentation masks for those objects in all subsequent frames.
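The input/output contract described above can be illustrated with a deliberately naive baseline. This is a minimal sketch, not a method from any paper listed on this page. It assumes frames arrive as a `(T, H, W, 3)` array and the first-frame mask as an `(H, W)` integer array with 0 for background. The function name `segment_video` and the nearest-mean-colour rule are hypothetical choices made only for illustration.

```python
import numpy as np

def segment_video(frames: np.ndarray, first_mask: np.ndarray) -> np.ndarray:
    """Hypothetical baseline: label every later pixel by nearest mean colour.

    frames:     (T, H, W, 3) video, any numeric dtype
    first_mask: (H, W) integer labels for frame 0 (0 = background)
    returns:    (T, H, W) integer labels, with frame 0 copied from first_mask
    """
    frames = np.asarray(frames, dtype=np.float64)
    labels = np.unique(first_mask)  # ids present in the first frame
    # One mean colour ("prototype") per label, computed from frame 0 only.
    prototypes = np.stack(
        [frames[0][first_mask == k].mean(axis=0) for k in labels]
    )
    masks = np.empty(frames.shape[:3], dtype=first_mask.dtype)
    masks[0] = first_mask  # the user-provided annotation is kept as-is
    for t in range(1, len(frames)):
        # Colour distance from every pixel to every prototype: (H, W, K).
        diff = frames[t][:, :, None, :] - prototypes[None, None, :, :]
        dist = np.linalg.norm(diff, axis=-1)
        masks[t] = labels[dist.argmin(axis=-1)]
    return masks

# Toy video (assumed data): a red 2x2 square moving right on a black background.
frames = np.zeros((3, 4, 4, 3))
for t in range(3):
    frames[t, 1:3, t:t + 2] = [1.0, 0.0, 0.0]
first_mask = np.zeros((4, 4), dtype=np.int64)
first_mask[1:3, 0:2] = 1

masks = segment_video(frames, first_mask)
```

Only the first frame is annotated, and every later mask is predicted. Real methods replace the colour rule with learned features and some form of temporal memory, but the signature stays the same.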
Latest papers
Efficient Video Object Segmentation via Modulated Cross-Attention Memory
Recently, transformer-based approaches have shown promising results for semi-supervised video object segmentation.
Video Object Segmentation with Dynamic Query Modulation
Storing intermediate frame segmentations as memory for long-range context modeling, spatial-temporal memory-based methods have recently showcased impressive results in semi-supervised video object segmentation (SVOS).
Lester: rotoscope animation through video object segmentation and tracking
This article introduces Lester, a novel method to automatically synthesize retro-style 2D animations from videos.
ODTrack: Online Dense Temporal Token Learning for Visual Tracking
To alleviate the above problem, we propose a simple, flexible and effective video-level tracking pipeline, named ODTrack, which densely associates the contextual relationships of video frames in an online token propagation manner.
Putting the Object Back into Video Object Segmentation
We present Cutie, a video object segmentation (VOS) network with object-level memory reading, which puts the object representation from memory back into the video object segmentation result.
Tracking Anything with Decoupled Video Segmentation
To 'track anything' without training on video data for every individual task, we develop a decoupled video segmentation approach (DEVA), composed of task-specific image-level segmentation and class/task-agnostic bi-directional temporal propagation.
XMem++: Production-level Video Segmentation From Few Annotated Frames
Despite advancements in user-guided video segmentation, extracting complex objects consistently for highly complex scenes is still a labor-intensive task, especially for production.
Tracking Anything in High Quality
To further improve the quality of tracking masks, a pretrained mask refiner (MR) model is employed to refine the tracking results.
READMem: Robust Embedding Association for a Diverse Memory in Unconstrained Video Object Segmentation
We present READMem (Robust Embedding Association for a Diverse Memory), a modular framework for semi-automatic video object segmentation (sVOS) methods designed to handle unconstrained videos.
Video Object Segmentation in Panoptic Wild Scenes
Considering the challenges in panoptic VOS, we propose a strong baseline method named panoptic object association with transformers (PAOT), which uses panoptic identification to associate objects with a pyramid architecture on multiple scales.