SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation

In this paper we introduce a Transformer-based approach to video object segmentation (VOS). To address the compounding error and scalability issues of prior work, we propose a scalable, end-to-end method for VOS called Sparse Spatiotemporal Transformers (SST). SST extracts per-pixel representations for each object in a video using sparse attention over spatiotemporal features. Our attention-based formulation for VOS allows a model to learn to attend over a history of multiple frames and provides a suitable inductive bias for performing the correspondence-like computations necessary for motion segmentation. We demonstrate the effectiveness of attention-based networks over recurrent networks in the spatiotemporal domain. Our method achieves competitive results on YouTube-VOS and DAVIS 2017 with improved scalability and robustness to occlusions compared with the state of the art. Code is available at https://github.com/dukebw/SSTVOS.
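To make the high-level description above concrete, the sketch below illustrates the general idea of letting each pixel of the current frame attend over features and mask embeddings drawn from a history of past frames. It is a minimal, hypothetical PyTorch sketch, not the authors' implementation: the class and argument names are invented, and it uses dense rather than sparse attention for brevity; SST's contribution is in sparsifying this attention for scalability.

```python
# Minimal sketch (hypothetical, not from the SSTVOS repo): a query frame's
# per-pixel features cross-attend over a multi-frame memory of features and
# embedded object masks, producing per-pixel representations for mask decoding.
import torch
import torch.nn as nn


class SpatiotemporalAttention(nn.Module):
    """Dense cross-attention from current-frame pixels to a multi-frame memory."""

    def __init__(self, d_model: int = 256, n_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)

    def forward(self, query_feats, memory_feats, memory_mask_emb):
        # query_feats:     (B, H*W, C)    features of the current frame
        # memory_feats:    (B, T*H*W, C)  features of T past frames
        # memory_mask_emb: (B, T*H*W, C)  embedded past object masks
        out, _ = self.attn(
            query=query_feats,
            key=memory_feats,
            value=memory_feats + memory_mask_emb,
        )
        return out  # per-pixel representations for the current frame


if __name__ == "__main__":
    B, T, H, W, C = 1, 4, 30, 30, 256
    layer = SpatiotemporalAttention(C)
    q = torch.randn(B, H * W, C)
    mem = torch.randn(B, T * H * W, C)
    mask_emb = torch.randn(B, T * H * W, C)
    print(layer(q, mem, mask_emb).shape)  # torch.Size([1, 900, 256])
```

The per-pixel outputs would then be decoded into an object mask for the current frame; sparse attention replaces the full query-key interaction above with a restricted set of spatiotemporal neighbors.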

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Semi-Supervised Video Object Segmentation | DAVIS (no YouTube-VOS training) | SSTVOS | D17 val (G) | 78.4 | #5 |
| | | | D17 val (J) | 75.4 | #5 |
| | | | D17 val (F) | 81.4 | #4 |
| Video Object Segmentation | YouTube-VOS 2018 | SST (Local) | Jaccard (Seen) | 80.9 | #12 |
| | | | Jaccard (Unseen) | 76.6 | #6 |
| Video Object Segmentation | YouTube-VOS 2019 | SST | Mean Jaccard & F-Measure | 81.8 | #9 |
| | | | Jaccard (Seen) | 80.9 | #9 |
| | | | Jaccard (Unseen) | 76.6 | #9 |
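For reference, J is the region similarity (intersection-over-union between the predicted and ground-truth masks), F is the boundary F-measure, and G is the mean of J and F. The NumPy snippet below is a small sketch of J only (a hypothetical helper, not taken from the SSTVOS repository); the boundary F-measure requires contour matching and is omitted.

```python
# Region similarity J: IoU between a predicted and a ground-truth binary mask.
import numpy as np


def jaccard(pred: np.ndarray, gt: np.ndarray) -> float:
    """IoU of two boolean masks; conventionally 1.0 when both are empty."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0
    return float(np.logical_and(pred, gt).sum() / union)


# Sequence-level J is the mean over frames:
# j_seq = np.mean([jaccard(p, g) for p, g in zip(pred_frames, gt_frames)])
```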

Methods