VOS is a type of video object segmentation model consisting of two network components: a target appearance model and a segmentation model. The target appearance model is a lightweight module that is learned during inference using fast optimization techniques and predicts a coarse but robust target segmentation. The segmentation model is trained exclusively offline and is designed to process these coarse scores into high-quality segmentation masks.
Source: Learning Fast and Robust Target Models for Video Object Segmentation
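
The two-stage idea described above can be illustrated with a toy sketch. Everything in it is an assumption made for illustration, not the paper's implementation: the target appearance model is replaced by a per-pixel linear classifier fit with plain gradient descent, the features are synthetic, and `refine` is a hypothetical placeholder (a simple threshold) standing in for the offline-trained segmentation network.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def make_frame(rng, top, left, size=16, channels=4):
    """Synthetic frame: channel 0 separates target from background, the rest is noise."""
    mask = np.zeros((size, size), dtype=bool)
    mask[top:top + 6, left:left + 6] = True
    features = rng.normal(0.0, 0.3, (size, size, channels))
    features[..., 0] += np.where(mask, 2.0, -2.0)
    return features, mask

def fit_target_model(features, mask, steps=200, lr=0.5, reg=1e-3):
    """Fit a light-weight per-pixel linear model at inference time (assumed stand-in)."""
    x = features.reshape(-1, features.shape[-1])
    y = mask.reshape(-1).astype(np.float64)
    weights = np.zeros(x.shape[1])
    bias = 0.0
    for _ in range(steps):
        err = sigmoid(x @ weights + bias) - y  # gradient of the logistic loss w.r.t. scores
        weights -= lr * (x.T @ err / len(y) + reg * weights)
        bias -= lr * err.mean()
    return weights, bias

def coarse_scores(features, weights, bias):
    """Coarse target scores for every pixel, shape (H, W)."""
    return features @ weights + bias

def refine(scores):
    """Hypothetical placeholder for the offline-trained segmentation network."""
    return scores > 0.0

rng = np.random.default_rng(0)
first_features, first_mask = make_frame(rng, top=2, left=3)   # annotated first frame
next_features, next_mask = make_frame(rng, top=7, left=8)     # later frame, target moved

weights, bias = fit_target_model(first_features, first_mask)  # learned during inference
scores = coarse_scores(next_features, weights, bias)
pred = refine(scores)

iou = np.logical_and(pred, next_mask).sum() / np.logical_or(pred, next_mask).sum()
print(f"IoU on the later frame: {iou:.3f}")
```

The split mirrors the description: only `fit_target_model` runs per video, while the refinement stage is fixed ahead of time and consumes nothing but the coarse scores.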
Task | Papers | Share |
---|---|---|
Video Object Segmentation | 102 | 18.85% |
Semantic Segmentation | 101 | 18.67% |
Video Semantic Segmentation | 100 | 18.48% |
Object | 67 | 12.38% |
Semi-Supervised Video Object Segmentation | 33 | 6.10% |
Optical Flow Estimation | 13 | 2.40% |
Unsupervised Video Object Segmentation | 8 | 1.48% |
Visual Object Tracking | 8 | 1.48% |
Object Detection | 7 | 1.29% |