VOS is a video object segmentation model consisting of two network components. The target appearance model is a light-weight module learned during inference using fast optimization techniques; it predicts a coarse but robust target segmentation. The segmentation network is trained exclusively offline and is designed to process these coarse scores into high-quality segmentation masks.
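The two-stage inference described above can be sketched in a minimal, illustrative form. The names below (`TargetModel`, `refine`) are hypothetical, not the authors' API: the online-learned appearance model is stood in for by per-pixel ridge regression (a fast closed-form optimization), and the offline segmentation network is stood in for by a simple threshold that turns coarse scores into a binary mask.

```python
import numpy as np

class TargetModel:
    """Hypothetical light-weight appearance model, fit online via ridge regression."""

    def __init__(self, reg=1e-2):
        self.reg = reg  # regularization strength for the closed-form solve
        self.w = None

    @staticmethod
    def _aug(feats):
        # Append a bias column so scores need not pass through the origin.
        return np.hstack([feats, np.ones((feats.shape[0], 1))])

    def fit(self, feats, labels):
        # feats: (N, D) per-pixel features; labels: (N,) target mask in {0, 1}.
        X = self._aug(feats)
        y = labels.astype(float)
        A = X.T @ X + self.reg * np.eye(X.shape[1])
        self.w = np.linalg.solve(A, X.T @ y)  # fast closed-form optimization

    def predict(self, feats):
        # Coarse but robust per-pixel target scores.
        return self._aug(feats) @ self.w

def refine(coarse_scores):
    # Stand-in for the offline-trained segmentation network:
    # here just a threshold producing a binary mask.
    return (coarse_scores > 0.5).astype(np.uint8)

# Toy usage: fit on first-frame features and labels, then segment.
rng = np.random.default_rng(0)
feats0 = rng.normal(size=(100, 8))
mask0 = (feats0[:, 0] > 0).astype(np.uint8)  # synthetic ground-truth mask

model = TargetModel()
model.fit(feats0, mask0)
mask1 = refine(model.predict(feats0))
```

In the real method, the appearance model operates on deep features and the refinement stage is a learned decoder; this sketch only mirrors the division of labor between the fast online component and the offline-trained one.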
Source: Learning Fast and Robust Target Models for Video Object Segmentation
| Task | Papers | Share |
|---|---|---|
| Video Object Segmentation | 84 | 21.65% |
| Semantic Segmentation | 83 | 21.39% |
| Video Semantic Segmentation | 82 | 21.13% |
| Semi-Supervised Video Object Segmentation | 33 | 8.51% |
| Optical Flow Estimation | 11 | 2.84% |
| Unsupervised Video Object Segmentation | 7 | 1.80% |
| One-shot visual object segmentation | 7 | 1.80% |
| Visual Object Tracking | 6 | 1.55% |
| Object Detection | 6 | 1.55% |