Video Object Segmentation
191 papers with code • 9 benchmarks • 16 datasets
Video object segmentation is a binary labeling problem aiming to separate foreground object(s) from the background region of a video.
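As a rough illustration of the "binary labeling" framing, the output for a video can be stored as one boolean mask per frame. The sketch below uses NumPy; the toy video, its dimensions, and the variable names are all assumptions made for illustration and do not come from any dataset or paper listed here.

```python
import numpy as np

# Hypothetical toy video: T frames of H x W RGB pixels (all values are made up).
T, H, W = 3, 4, 5
video = np.zeros((T, H, W, 3), dtype=np.uint8)

# One boolean mask per frame: True = foreground, False = background.
masks = np.zeros((T, H, W), dtype=bool)
for t in range(T):
    # A 2x2 "object" that moves one pixel to the right in each frame.
    masks[t, 1:3, t:t + 2] = True
    # Paint the object white so the pixels and the labels agree.
    video[t][masks[t]] = 255

# Share of pixels labeled foreground in each frame.
foreground_fraction = masks.mean(axis=(1, 2))
```

With several objects, a common alternative is an integer label map per frame (0 for background, 1..K for objects); the binary case above is the K = 1 special case.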
For leaderboards please refer to the different subtasks.
Most implemented papers
Emerging Properties in Self-Supervised Vision Transformers
In this paper, we question if self-supervised learning provides new properties to Vision Transformer (ViT) that stand out compared to convolutional networks (convnets).
PReMVOS: Proposal-generation, Refinement and Merging for Video Object Segmentation
We address semi-supervised video object segmentation, the task of automatically generating accurate and consistent pixel masks for objects in a video sequence, given the first-frame ground truth annotations.
Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion
We present the Modular interactive VOS (MiVOS) framework, which decouples interaction-to-mask and mask propagation, allowing for higher generalizability and better performance.
Rethinking Self-supervised Correspondence Learning: A Video Frame-level Similarity Perspective
To learn generalizable representations for correspondence at large scale, a variety of self-supervised pretext tasks have been proposed to explicitly perform object-level or patch-level similarity learning.
One-Shot Video Object Segmentation
This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.
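The semi-supervised setting (first-frame mask in, masks for every frame out) can be sketched as a function interface. The example below is a deliberately naive color-matching baseline, not the method of this or any other paper listed here; the function name `propagate_first_frame_mask` and the toy data are hypothetical.

```python
import numpy as np

def propagate_first_frame_mask(frames, first_mask):
    """Toy baseline (illustrative only): label every pixel in every frame by
    whichever first-frame mean color, foreground or background, is closer.

    frames:     (T, H, W, 3) array of RGB frames
    first_mask: (H, W) boolean mask of the object in frame 0; it must contain
                both foreground and background pixels
    returns:    (T, H, W) boolean masks
    """
    frames = frames.astype(np.float64)
    fg_mean = frames[0][first_mask].mean(axis=0)   # mean foreground color in frame 0
    bg_mean = frames[0][~first_mask].mean(axis=0)  # mean background color in frame 0
    dist_fg = np.linalg.norm(frames - fg_mean, axis=-1)
    dist_bg = np.linalg.norm(frames - bg_mean, axis=-1)
    return dist_fg < dist_bg

# Hypothetical toy video: a red 2x2 square moving right over a black background.
T, H, W = 3, 4, 6
frames = np.zeros((T, H, W, 3), dtype=np.uint8)
truth = np.zeros((T, H, W), dtype=bool)
for t in range(T):
    truth[t, 1:3, t:t + 2] = True
    frames[t][truth[t]] = (255, 0, 0)

pred = propagate_first_frame_mask(frames, truth[0])
```

This only works here because the toy object has a color distinct from the background; the sketch is meant to show the input/output contract of the task rather than a usable segmentation approach.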
Lucid Data Dreaming for Video Object Segmentation
Our approach is suitable for both single and multiple object segmentation.
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.
Interactive Video Object Segmentation Using Global and Local Transfer Modules
The global transfer module conveys the segmentation information in an annotated frame to a target frame, while the local transfer module propagates the segmentation information in a temporally adjacent frame to the target frame.
Video Polyp Segmentation: A Deep Learning Perspective
We present the first comprehensive video polyp segmentation (VPS) study in the deep learning era.
Video Object Segmentation with Re-identification
Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module.