Video Segmentation

70 papers with code • 1 benchmark • 9 datasets

Most implemented papers

One-Shot Video Object Segmentation

kmaninis/OSVOS-PyTorch CVPR 2017

This paper tackles the task of semi-supervised video object segmentation, i.e., the separation of an object from the background in a video, given the mask of the first frame.

YouTube-VOS: Sequence-to-Sequence Video Object Segmentation

BehradToghi/ECCV_Youtube_VOS ECCV 2018

End-to-end sequential learning to explore spatial-temporal features for video segmentation is largely limited by the scale of available video segmentation datasets, i.e., even the largest video segmentation dataset only contains 90 short video clips.

Mask2Former for Video Instance Segmentation

facebookresearch/Mask2Former 20 Dec 2021

We find Mask2Former also achieves state-of-the-art performance on video instance segmentation without modifying the architecture, the loss or even the training pipeline.

Video Object Segmentation with Re-identification

lxx1991/VS-ReID 1 Aug 2017

Specifically, our Video Object Segmentation with Re-identification (VS-ReID) model includes a mask propagation module and a ReID module.

CCNet: Criss-Cross Attention for Semantic Segmentation

speedinghzl/CCNet ICCV 2019

Compared with the non-local block, the proposed recurrent criss-cross attention module requires 11x less GPU memory.

Physarum Powered Differentiable Linear Programming Layers and Applications

HeatherJiaZG/SuperGlue-pytorch 30 Apr 2020

We describe our development and show the use of our solver in a video segmentation task and meta-learning for few-shot learning.

EPIC-KITCHENS VISOR Benchmark: VIdeo Segmentations and Object Relations

epic-kitchens/visor-hos 26 Sep 2022

VISOR annotates videos from EPIC-KITCHENS, which comes with a new set of challenges not encountered in current video segmentation datasets.

Rethinking the Evaluation of Video Summaries

mayu-ot/rethinking-evs CVPR 2019

Video summarization is a technique to create a short skim of the original video while preserving the main stories/content.

Temporal Aggregate Representations for Long-Range Video Understanding

dibschat/tempAgg ECCV 2020

Future prediction, especially in long-range videos, requires reasoning from current and past observations.

TSPNet: Hierarchical Feature Learning via Temporal Semantic Pyramid for Sign Language Translation

verashira/TSPNet NeurIPS 2020

Sign language translation (SLT) aims to interpret sign video sequences into text-based natural language sentences.