70 papers with code • 1 benchmark • 9 datasets
These leaderboards are used to track progress in Video Segmentation.
Most implemented papers
One-Shot Video Object Segmentation
This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.
YouTube-VOS: Sequence-to-Sequence Video Object Segmentation
End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.
Mask2Former for Video Instance Segmentation
We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.
Video Object Segmentation with Re-identification
Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module.
CCNet: Criss-Cross Attention for Semantic Segmentation
Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory.
Physarum Powered Differentiable Linear Programming Layers and Applications
We describe our development and show the use of our solver in a video segmentation task and meta-learning for few-shot learning.
EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations
VISOR annotates videos from EPIC-KITCHENS, which comes with a new set of challenges not encountered in current video segmentation datasets.
Rethinking the Evaluation of Video Summaries
Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.
Temporal Aggregate Representations for Long-Range Video Understanding
Future prediction, especially in long-range videos, requires reasoning from current and past observations.
TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation
Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.