VOS is a video object segmentation model consisting of two network components. The target appearance model is a light-weight module, learned during the inference stage with fast optimization techniques, that predicts a coarse but robust target segmentation. The segmentation model, trained exclusively offline, is designed to process these coarse scores into high-quality segmentation masks.
Source: Learning Fast and Robust Target Models for Video Object Segmentation
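The two-component design can be sketched roughly as below. This is a minimal, hypothetical PyTorch sketch: the class names, feature shapes, and the plain gradient-descent adaptation loop are illustrative assumptions rather than the authors' implementation (the paper relies on a dedicated fast optimizer for the target model).

```python
# Hypothetical sketch of the two-component VOS design; not the authors' code.
import torch
import torch.nn as nn
import torch.nn.functional as F

class TargetModel(nn.Module):
    """Light-weight target appearance model: a single small conv filter whose
    weights are optimized at inference time to yield coarse target scores."""
    def __init__(self, feat_channels=256):
        super().__init__()
        self.filter = nn.Conv2d(feat_channels, 1, kernel_size=3, padding=1)

    def forward(self, feats):
        return self.filter(feats)  # coarse, low-resolution score map

class SegmentationNetwork(nn.Module):
    """Segmentation model trained exclusively offline; turns coarse scores
    plus backbone features into a refined mask (heavily simplified here)."""
    def __init__(self, feat_channels=256):
        super().__init__()
        self.decode = nn.Sequential(
            nn.Conv2d(feat_channels + 1, 64, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 1, 3, padding=1),
        )

    def forward(self, feats, coarse_scores):
        x = torch.cat([feats, coarse_scores], dim=1)
        logits = self.decode(x)
        # upsample as a stand-in for the full offline-trained decoder
        return F.interpolate(logits, scale_factor=4, mode="bilinear",
                             align_corners=False)

def adapt_target_model(target_model, feats, first_frame_mask,
                       steps=20, lr=1e-2):
    """Inference-time learning of the target model on the annotated first
    frame. Plain SGD is used here only to illustrate the idea; the actual
    method uses a much faster specialized solver."""
    opt = torch.optim.SGD(target_model.parameters(), lr=lr)
    target = F.interpolate(first_frame_mask, size=feats.shape[-2:],
                           mode="nearest")
    for _ in range(steps):
        opt.zero_grad()
        loss = F.binary_cross_entropy_with_logits(target_model(feats), target)
        loss.backward()
        opt.step()
    return target_model

# Illustrative usage with random stand-ins for backbone features and the
# given first-frame annotation.
feats = torch.randn(1, 256, 60, 108)
first_mask = torch.rand(1, 1, 240, 432).round()
tm = adapt_target_model(TargetModel(), feats, first_mask)
refiner = SegmentationNetwork()  # weights would be learned offline
with torch.no_grad():
    fine_mask = torch.sigmoid(refiner(feats, tm(feats)))
```

The key design choice illustrated here is the split of responsibilities: only the tiny target model is updated per video, keeping online adaptation fast, while the heavier segmentation network stays fixed after offline training.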
Task | Papers | Share |
---|---|---|
Video Object Segmentation | 47 | 21.17% |
Semantic Segmentation | 45 | 20.27% |
Video Semantic Segmentation | 45 | 20.27% |
Semi-Supervised Video Object Segmentation | 25 | 11.26% |
Optical Flow Estimation | 7 | 3.15% |
One-shot visual object segmentation | 7 | 3.15% |
Unsupervised Video Object Segmentation | 6 | 2.70% |
Object Detection | 6 | 2.70% |
Visual Object Tracking | 4 | 1.80% |