Video Semantic Segmentation

326 papers with code • 5 benchmarks • 8 datasets

Benchmarks

These leaderboards are used to track progress in Video Semantic Segmentation.

Dataset	Best Model
Cityscapes val	TMANet-50
CamVid	TMANet-50
VSPW	DVIS++ (ViT-L)
LaRS	WaSR-T (ResNet-101)
Multispectral Video Semantic Segmentation	MVNet (DeepLabV3)

Libraries

Use these libraries to find Video Semantic Segmentation models and implementations.

yoxu515/aot-benchmark • 4 papers • 564 stars
PaddlePaddle/PaddleSeg • 3 papers • 8,267 stars
visionml/pytracking • 3 papers • 3,092 stars
hkchengrex/Mask-Propagation • 3 papers • 124 stars

See all 9 libraries.

Datasets

Subtasks

Camera shot segmentation

Latest papers

RAP-SAM: Towards Real-Time All-Purpose Segment Anything

xushilin1/rap-sam • 18 Jan 2024

Segment Anything Model (SAM) is one remarkable model that can achieve generalized segmentation.

190 stars

1st Place Solution for 5th LSVOS Challenge: Referring Video Object Segmentation

robertluo1/iccv2023_rvos_challenge • 1 Jan 2024

Recent transformer-based models have dominated the Referring Video Object Segmentation (RVOS) task due to their superior performance.

Tracking with Human-Intent Reasoning

jiawen-zhu/trackgpt • 29 Dec 2023

The perception component then generates the tracking results based on the embeddings.

UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces

foundationvision/uniref • 25 Dec 2023

We evaluate our unified models on various benchmarks.

223 stars

DVIS++: Improved Decoupled Framework for Universal Video Segmentation

zhang-tao-whu/DVIS_Plus • 20 Dec 2023

We present the Decoupled VIdeo Segmentation (DVIS) framework, a novel approach for the challenging task of universal video segmentation, including video instance segmentation (VIS), video semantic segmentation (VSS), and video panoptic segmentation (VPS).

AutoVisual Fusion Suite: A Comprehensive Evaluation of Image Segmentation and Voice Conversion Tools on HuggingFace Platform

amirrezahmi/video-inpainting-and-voice-cloning • 17 Dec 2023

This study presents a comprehensive evaluation of tools available on the HuggingFace platform for two pivotal applications in artificial intelligence: image segmentation and voice conversion.

Hierarchical Graph Pattern Understanding for Zero-Shot VOS

nust-machine-intelligence-laboratory/hgpu • 15 Dec 2023

However, existing optical flow-based methods have a significant dependency on optical flow, which results in poor performance when the optical flow estimation fails for a particular scene.

Semi-supervised Active Learning for Video Action Detection

akash2907/semi-sup-active-learning • 12 Dec 2023

First, we demonstrate its effectiveness on video action detection where the proposed approach outperforms prior works in semi-supervised and weakly-supervised learning along with several baseline approaches in both UCF101-24 and JHMDB-21.

Flexible visual prompts for in-context learning in computer vision

v7labs/xmem_icl • 11 Dec 2023

Additionally, we propose a technique for support set selection, which involves choosing the most relevant images to include in this set.

Efficient Multimodal Semantic Segmentation via Dual-Prompt Learning

shaohuadong2021/dplnet • 1 Dec 2023

Existing approaches often fully fine-tune a dual-branch encoder-decoder framework with a complicated feature fusion strategy for achieving multimodal semantic segmentation, which is training-costly due to the massive parameter updates in feature extraction and fusion.
