TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=1)	Jaccard (Mean)	89.6	# 24
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=1)	F-measure (Mean)	90.9	# 32
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=1)	J&F	90.3	# 30
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=1)	Speed (FPS)	37.4	# 10
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L (MS)	Jaccard (Mean)	91.6	# 4
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L (MS)	F-measure (Mean)	94.4	# 3
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L (MS)	J&F	93.0	# 3
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L (MS)	Speed (FPS)	1.3	# 31
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3, MS)	Jaccard (Mean)	91.5	# 5
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3, MS)	F-measure (Mean)	94.5	# 2
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3, MS)	J&F	93.0	# 3
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3, MS)	Speed (FPS)	1.3	# 31
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=3)	Jaccard (Mean)	90.6	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=3)	F-measure (Mean)	93.6	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=3)	J&F	92.1	# 10
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=3)	Speed (FPS)	17.5	# 25
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3)	Jaccard (Mean)	90.5	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3)	F-measure (Mean)	94.2	# 5
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3)	J&F	92.4	# 7
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOST (L'=3)	Speed (FPS)	12.0	# 29
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L	Jaccard (Mean)	90.6	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L	F-measure (Mean)	94.1	# 7
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L	J&F	92.4	# 7
Semi-Supervised Video Object Segmentation	DAVIS 2016	SwinB-AOTv2-L	Speed (FPS)	12.0	# 29
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=2)	Jaccard (Mean)	90.5	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=2)	F-measure (Mean)	93.4	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=2)	J&F	92.0	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2016	R50-AOST (L'=2)	Speed (FPS)	24.3	# 21
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=3)	J&F	79.9	# 18
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=3)	Jaccard (Mean)	76.2	# 20
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=3)	F-measure (Mean)	83.6	# 18
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=3)	FPS	17.5	# 16
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3, MS)	J&F	84.7	# 4
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3, MS)	Jaccard (Mean)	80.9	# 5
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3, MS)	F-measure (Mean)	88.5	# 4
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3, MS)	FPS	1.3	# 20
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=2)	J&F	78.1	# 25
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=2)	Jaccard (Mean)	74.5	# 24
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=2)	F-measure (Mean)	81.7	# 25
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	R50-AOST (L'=2)	FPS	24.3	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3)	J&F	82.7	# 10
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3)	Jaccard (Mean)	78.8	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3)	F-measure (Mean)	86.6	# 9
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOST (L'=3)	FPS	12.0	# 19
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOTv2-L	J&F	84.5	# 5
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOTv2-L	Jaccard (Mean)	81.0	# 4
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOTv2-L	F-measure (Mean)	87.9	# 5
Semi-Supervised Video Object Segmentation	DAVIS 2017 (test-dev)	SwinB-AOTv2-L	FPS	1.3	# 20
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L (MS)	Jaccard (Mean)	84.2	# 9
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L (MS)	F-measure (Mean)	89.8	# 10
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L (MS)	J&F	87.0	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L (MS)	Speed (FPS)	1.3	# 29
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L (MS)	Params(M)	65.6	# 19
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=1)	Jaccard (Mean)	81.2	# 29
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=1)	F-measure (Mean)	86.1	# 31
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=1)	J&F	83.7	# 30
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=1)	Speed (FPS)	37.4	# 9
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=1)	Params(M)	12.5	# 9
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=2)	Jaccard (Mean)	82.5	# 18
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=2)	F-measure (Mean)	88.0	# 23
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=2)	J&F	85.3	# 20
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=2)	Speed (FPS)	24.3	# 14
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=2)	Params(M)	13.9	# 12
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=3)	Jaccard (Mean)	82.6	# 17
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=3)	F-measure (Mean)	88.5	# 19
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=3)	J&F	85.6	# 17
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=3)	Speed (FPS)	17.5	# 24
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	R50-AOST (L'=3)	Params(M)	15.4	# 14
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOST (L'=3, MS)	Jaccard (Mean)	83.8	# 12
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOST (L'=3, MS)	F-measure (Mean)	89.5	# 11
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOST (L'=3, MS)	J&F	86.7	# 12
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOST (L'=3, MS)	Speed (FPS)	1.3	# 29
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOST (L'=3, MS)	Params(M)	65.6	# 19
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L	Jaccard (Mean)	83.1	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L	F-measure (Mean)	89.4	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L	J&F	86.3	# 13
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L	Speed (FPS)	12.0	# 27
Semi-Supervised Video Object Segmentation	DAVIS 2017 (val)	SwinB-AOTv2-L	Params(M)	65.6	# 19
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	F-Measure (Seen)	90.7	# 2
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	F-Measure (Unseen)	88.9	# 5
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	Overall	86.5	# 4
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	Jaccard (Seen)	85.6	# 2
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	Jaccard (Unseen)	80.7	# 4
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	Speed (FPS)	0.7	# 14
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames, MS)	Params(M)	65.6	# 21
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames)	F-Measure (Seen)	90.1	# 6
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames)	F-Measure (Unseen)	88.2	# 9
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames)	Overall	85.8	# 8
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames)	Jaccard (Unseen)	79.6	# 9
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	SwinB-AOTv2-L (all frames)	Speed (FPS)	5.1	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	F-Measure (Seen)	88.8	# 16
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	F-Measure (Unseen)	87.9	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	Overall	85.0	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	Jaccard (Seen)	83.8	# 15
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	Jaccard (Unseen)	79.3	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	Speed (FPS)	14.9	# 10
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=3)	Params(M)	15.4	# 17
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	F-Measure (Seen)	86.1	# 35
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	F-Measure (Unseen)	83.5	# 35
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	Overall	81.6	# 35
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	Jaccard (Seen)	81.4	# 35
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	Jaccard (Unseen)	75.5	# 36
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	Speed (FPS)	30.9	# 4
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=1)	Params(M)	12.5	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	F-Measure (Seen)	88.5	# 18
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	F-Measure (Unseen)	87.2	# 15
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	Overall	84.5	# 16
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	Jaccard (Seen)	83.5	# 20
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	Jaccard (Unseen)	78.8	# 16
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	Speed (FPS)	20.2	# 8
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOST (L'=2)	Params(M)	13.9	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	F-Measure (Seen)	90.2	# 5
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	F-Measure (Unseen)	87.3	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	Overall	85.4	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	Jaccard (Seen)	85.1	# 6
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	Jaccard (Unseen)	78.9	# 14
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	Speed (FPS)	6.3	# 12
Semi-Supervised Video Object Segmentation	YouTube-VOS 2018	R50-AOTv2-L (all frames)	Params(M)	15.1	# 16
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=1)	Overall	81.5	# 20
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=1)	Jaccard (Seen)	81.0	# 21
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=1)	Jaccard (Unseen)	754.8	# 1
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=1)	F-Measure (Seen)	85.6	# 20
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=1)	F-Measure (Unseen)	83.8	# 22
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames, MS)	Overall	86.5	# 3
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames, MS)	Jaccard (Seen)	85.5	# 2
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames, MS)	Jaccard (Unseen)	81.0	# 6
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames, MS)	F-Measure (Seen)	90.3	# 2
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames, MS)	F-Measure (Unseen)	89.1	# 4
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=3)	Overall	84.9	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=3)	Jaccard (Seen)	83.8	# 10
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=3)	Jaccard (Unseen)	79.3	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=3)	F-Measure (Seen)	88.7	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=3)	F-Measure (Unseen)	87.7	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=2)	Overall	84.3	# 14
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=2)	Jaccard (Seen)	83.3	# 15
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=2)	Jaccard (Unseen)	78.9	# 17
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=2)	F-Measure (Seen)	88.0	# 13
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	R50-AOST (L'=2)	F-Measure (Unseen)	87.1	# 15
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames)	Overall	85.2	# 9
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames)	Jaccard (Seen)	84.2	# 9
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames)	Jaccard (Unseen)	79.8	# 11
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames)	F-Measure (Seen)	88.9	# 9
Semi-Supervised Video Object Segmentation	YouTube-VOS 2019	SwinB-AOTv2-L (all frames)	F-Measure (Unseen)	88.0	# 10
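
The combined scores in the table above can be sanity-checked against their components. On DAVIS, J&F is the mean of the Jaccard and F-measure means; on YouTube-VOS, Overall is the mean of the four seen/unseen Jaccard and F-Measure scores. The sketch below is a minimal check under those definitions, using a handful of rows copied from the table. The names `ROWS`, `TOLERANCE`, `is_consistent`, `implied_component` and `flag_inconsistent` are illustrative, and the 0.1 tolerance is an assumption that allows for every figure being rounded to one decimal place.

```python
# Reported values copied from the table above:
# (dataset, model, component scores, combined score).
ROWS = [
    ("DAVIS 2016", "R50-AOST (L'=1)", [89.6, 90.9], 90.3),
    ("DAVIS 2017 (val)", "R50-AOST (L'=3)", [82.6, 88.5], 85.6),
    ("YouTube-VOS 2018", "R50-AOST (L'=3)", [83.8, 79.3, 88.8, 87.9], 85.0),
    ("YouTube-VOS 2019", "R50-AOST (L'=1)", [81.0, 754.8, 85.6, 83.8], 81.5),
]

TOLERANCE = 0.1  # assumed slack: all figures are rounded to one decimal place


def is_consistent(components, combined, tol=TOLERANCE):
    """True if every component is a valid percentage and their mean matches `combined`."""
    in_range = all(0.0 <= c <= 100.0 for c in components)
    mean = sum(components) / len(components)
    return in_range and abs(mean - combined) <= tol + 1e-9


def implied_component(other_components, combined):
    """Value one missing component would need for the mean to equal `combined`."""
    n = len(other_components) + 1
    return n * combined - sum(other_components)


def flag_inconsistent(rows):
    """Return (dataset, model) pairs whose combined score does not match its components."""
    return [(dataset, model)
            for dataset, model, components, combined in rows
            if not is_consistent(components, combined)]


if __name__ == "__main__":
    for dataset, model in flag_inconsistent(ROWS):
        print(f"check {model} on {dataset}")
```

Of these four rows, only the YouTube-VOS 2019 entry for R50-AOST (L'=1) is flagged: its Jaccard (Unseen) value of 754.8 lies outside the 0 to 100 range and does not fit the reported Overall of 81.5. `implied_component([81.0, 85.6, 83.8], 81.5)` gives roughly 75.6, which is only an estimate within rounding error and not a verified result.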

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/associating-objects-with-scalable/visual-object-tracking-on-davis-2016)](https://paperswithcode.com/sota/visual-object-tracking-on-davis-2016?p=associating-objects-with-scalable)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/associating-objects-with-scalable/semi-supervised-video-object-segmentation-on-18)](https://paperswithcode.com/sota/semi-supervised-video-object-segmentation-on-18?p=associating-objects-with-scalable)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/associating-objects-with-scalable/semi-supervised-video-object-segmentation-on-1)](https://paperswithcode.com/sota/semi-supervised-video-object-segmentation-on-1?p=associating-objects-with-scalable)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/associating-objects-with-scalable/video-object-segmentation-on-youtube-vos)](https://paperswithcode.com/sota/video-object-segmentation-on-youtube-vos?p=associating-objects-with-scalable)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/associating-objects-with-scalable/visual-object-tracking-on-davis-2017)](https://paperswithcode.com/sota/visual-object-tracking-on-davis-2017?p=associating-objects-with-scalable)`

Scalable Video Object Segmentation with Identification Mechanism

22 Mar 2022 · Zongxin Yang, Jiaxu Miao, Yunchao Wei, Wenguan Wang, Xiaohan Wang, Yi Yang

This paper delves into the challenges of achieving scalable and effective multi-object modeling for semi-supervised Video Object Segmentation (VOS). Previous VOS methods decode features with a single positive object, limiting the learning of multi-object representation as they must match and segment each target separately under multi-object scenarios. Additionally, earlier techniques catered to specific application objectives and lacked the flexibility to fulfill different speed-accuracy requirements. To address these problems, we present two innovative approaches, Associating Objects with Transformers (AOT) and Associating Objects with Scalable Transformers (AOST). In pursuing effective multi-object modeling, AOT introduces the IDentification (ID) mechanism to allocate each object a unique identity. This approach enables the network to model the associations among all objects simultaneously, thus facilitating the tracking and segmentation of objects in a single network pass. To address the challenge of inflexible deployment, AOST further integrates scalable long short-term transformers that incorporate scalable supervision and layer-wise ID-based attention. This enables online architecture scalability in VOS for the first time and overcomes ID embeddings' representation limitations. Given the absence of a benchmark for VOS involving dense multi-object annotations, we propose a challenging Video Object Segmentation in the Wild (VOSW) benchmark to validate our approaches. We evaluated various AOT and AOST variants using extensive experiments across VOSW and five commonly used VOS benchmarks, including YouTube-VOS 2018 & 2019 Val, DAVIS-2017 Val & Test, and DAVIS-2016. Our approaches surpass the state-of-the-art competitors and display exceptional efficiency and scalability consistently across all six benchmarks. Project page: https://github.com/yoxu515/aot-benchmark.
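
The idea of allocating each object a unique identity can be illustrated with a small sketch. It is not the authors' implementation: the function names (`build_id_bank`, `assign_identities`, `embed_mask`), the random bank values, the random assignment, and the zero vector for background are all assumptions made for illustration. The point it shows is that a mask containing several objects can be turned into one embedding in which every object carries its own identity vector, so all objects can be passed through a network together instead of one at a time.

```python
import random


def build_id_bank(num_ids, dim, seed=0):
    """Create `num_ids` identity vectors of length `dim` (random here; learned in a real model)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_ids)]


def assign_identities(object_labels, num_ids, seed=0):
    """Map every object label to its own bank row, so no two objects share an identity."""
    if len(object_labels) > num_ids:
        raise ValueError("more objects than available identities")
    rng = random.Random(seed)
    rows = rng.sample(range(num_ids), len(object_labels))
    return dict(zip(object_labels, rows))


def embed_mask(mask, assignment, id_bank):
    """Turn an H x W label mask into an H x W x dim identity embedding.

    Label 0 is treated as background and mapped to a zero vector
    (an assumption of this sketch).
    """
    dim = len(id_bank[0])
    return [[list(id_bank[assignment[label]]) if label != 0 else [0.0] * dim
             for label in row]
            for row in mask]


if __name__ == "__main__":
    # Two objects (labels 1 and 2) plus background (label 0).
    mask = [[0, 1, 1],
            [2, 2, 0]]
    bank = build_id_bank(num_ids=10, dim=4)
    assignment = assign_identities([1, 2], num_ids=10)
    embedding = embed_mask(mask, assignment, bank)
    print(len(embedding), len(embedding[0]), len(embedding[0][0]))  # 2 3 4
```

Because the embedding has a fixed size regardless of how many objects the mask contains, the cost of the downstream pass in this sketch does not grow with the object count, up to the capacity of the bank.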

Code

yoxu515/aot-benchmark (official), 560 stars

z-x-yang/AOT (official), 116 stars

Tasks

Object

Segmentation

Semantic Segmentation

Semi-Supervised Video Object Segmentation

Video Object Segmentation

Video Semantic Segmentation

Datasets

DAVIS

DAVIS 2017

DAVIS 2016

YouTube-VOS 2018

Referring Expressions for DAVIS 2016 & 2017

VIPSeg

Results from the Paper

Ranked #3 on Semi-Supervised Video Object Segmentation on YouTube-VOS 2019

Methods

Absolute Position Encodings • Adam • BPE • Dense Connections • Dropout • Label Smoothing • Layer Normalization • Linear Layer • Multi-Head Attention • Position-Wise Feed-Forward Layer • Residual Connection • Scaled Dot-Product Attention • Softmax • Transformer • VOS
