RefineVIS: Video Instance Segmentation with Temporal Attention Refinement

7 Jun 2023  ·  Andre Abrantes, Jiang Wang, Peng Chu, Quanzeng You, Zicheng Liu ·

We introduce RefineVIS, a novel framework for Video Instance Segmentation (VIS) that achieves accurate object association between frames and precise segmentation masks by iteratively refining representations with sequence context. RefineVIS learns two separate representations on top of an off-the-shelf frame-level image instance segmentation model: an association representation responsible for linking objects across frames, and a segmentation representation that produces accurate segmentation masks. Contrastive learning is used to learn temporally stable association representations. A Temporal Attention Refinement (TAR) module learns discriminative segmentation representations by exploiting temporal relationships and a novel temporal contrastive denoising technique. Our method supports both online and offline inference, and achieves state-of-the-art video instance segmentation accuracy on the YouTube-VIS 2019 (64.4 AP), YouTube-VIS 2021 (61.4 AP), and OVIS (46.1 AP) datasets. Visualizations show that the TAR module generates more accurate instance segmentation masks, particularly in challenging cases such as highly occluded objects.
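The abstract describes TAR as refining per-frame segmentation representations by attending over the temporal sequence. As a rough illustration of that idea (not the paper's actual implementation; the module name, dimensions, and residual structure below are assumptions), per-frame instance queries can attend across the frames of a clip and be refined with a residual update:

```python
# Hypothetical sketch of a TAR-style refinement step: each instance's per-frame
# segmentation queries attend over the temporal window, and the attended result
# is added back as a residual. All names and sizes are illustrative.
import torch
import torch.nn as nn


class TemporalAttentionRefinement(nn.Module):
    def __init__(self, embed_dim: int = 256, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(embed_dim)

    def forward(self, queries: torch.Tensor) -> torch.Tensor:
        # queries: (num_instances, num_frames, embed_dim), one query per
        # instance per frame from a frame-level segmentation model.
        refined, _ = self.attn(queries, queries, queries)  # attend over frames
        return self.norm(queries + refined)  # residual refinement


# Example: refine queries for 10 instances over a 5-frame clip.
tar = TemporalAttentionRefinement()
q = torch.randn(10, 5, 256)
out = tar(q)
print(out.shape)  # torch.Size([10, 5, 256])
```

The refined queries would then be decoded into per-frame masks; since refinement uses the whole window, the same design can run on short windows online or full clips offline.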


Results from the Paper


Ranked #3 on Video Instance Segmentation on YouTube-VIS 2021 (using extra training data)

| Task | Dataset | Model | Metric | Value | Global Rank |
|---|---|---|---|---|---|
| Video Instance Segmentation | OVIS validation | RefineVIS (Swin-L, offline) | mask AP | 46 | #8 |
| | | | AP50 | 70.4 | #7 |
| | | | AP75 | 48.4 | #7 |
| | | | AR1 | 19.1 | #7 |
| | | | AR10 | 51.2 | #5 |
| Video Instance Segmentation | YouTube-VIS 2021 | RefineVIS (Swin-L, online) | mask AP | 61.4 | #3 |
| | | | AP50 | 84.1 | #2 |
| | | | AP75 | 68.5 | #3 |
| | | | AR10 | 65.2 | #4 |
| | | | AR1 | 48.3 | #5 |
| Video Instance Segmentation | YouTube-VIS validation | RefineVIS (Swin-L, offline) | mask AP | 64.4 | #8 |
| | | | AP50 | 88.3 | #3 |
| | | | AP75 | 72.2 | #5 |
| | | | AR1 | 55.8 | #8 |
| | | | AR10 | 68.4 | #8 |
