TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Referring Expression Segmentation	A2Dre test	RefVOS	Overall IoU	47.5	# 1
Referring Expression Segmentation	A2Dre test	RefVOS	Mean IoU	33.2	# 1
Referring Expression Segmentation	A2D Sentences	RefVOS	Precision@0.5	0.495	# 23
Referring Expression Segmentation	A2D Sentences	RefVOS	Precision@0.9	0.064	# 18
Referring Expression Segmentation	A2D Sentences	RefVOS	IoU overall	0.599	# 22
Referring Expression Segmentation	A2D Sentences	RefVOS	IoU mean	0.599	# 9
Referring Expression Segmentation	DAVIS 2017 (val)	RefVOS	J&F 1st frame	44.5	# 11
Referring Expression Segmentation	DAVIS 2017 (val)	RefVOS	J&F Full video	45.1	# 3
Referring Expression Segmentation	RefCOCO testA	RefVOS with BERT Pre-train	Overall IoU	63.19	# 19
Referring Expression Segmentation	RefCOCO testA	RefVOS with Bi-LSTM	Overall IoU	52.90	# 24
Referring Expression Segmentation	RefCOCO+ testA	RefVOS with BERT + MLM loss	Overall IoU	49.73	# 19
Referring Expression Segmentation	RefCOCO testB	RefVOS with BERT Pre-train	Overall IoU	54.17	# 18
Referring Expression Segmentation	RefCOCO+ testB	RefVOS with BERT + MLM loss	Overall IoU	36.17	# 20
Referring Expression Segmentation	RefCOCO val	RefVOS with BERT Pre-train	Overall IoU	58.65	# 22
Referring Expression Segmentation	RefCOCO val	RefVOS with BERT + MLM loss	Overall IoU	59.45	# 20
Referring Expression Segmentation	RefCOCO+ val	RefVOS with BERT + MLM loss	Overall IoU	44.71	# 21
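
The table reports Overall IoU, Mean IoU, and Precision@K without defining them. The sketch below shows how these metrics are commonly computed in the referring segmentation literature; the definitions are an assumption, since this page does not state them, and the names `intersection_and_union` and `evaluate` are hypothetical helpers rather than part of the RefVOS code. Overall IoU pools intersections and unions over all samples, Mean IoU averages per-sample IoU, and Precision@K is the fraction of samples whose IoU exceeds K.

```python
from typing import Dict, List, Sequence, Tuple

Mask = List[List[int]]  # binary mask: rows of 0/1 values


def intersection_and_union(pred: Mask, target: Mask) -> Tuple[int, int]:
    """Count pixels in the intersection and union of two same-sized masks."""
    inter = union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            inter += 1 if (p and t) else 0
            union += 1 if (p or t) else 0
    return inter, union


def evaluate(
    preds: Sequence[Mask],
    targets: Sequence[Mask],
    thresholds: Sequence[float] = (0.5, 0.9),
) -> Dict[str, float]:
    """Compute overall IoU, mean IoU, and Precision@K over a set of samples."""
    if not preds or len(preds) != len(targets):
        raise ValueError("preds and targets must be non-empty and equal length")

    total_inter = total_union = 0
    ious = []
    for pred, target in zip(preds, targets):
        inter, union = intersection_and_union(pred, target)
        total_inter += inter
        total_union += union
        # Assumed convention: two empty masks count as a perfect match.
        ious.append(inter / union if union else 1.0)

    results = {
        # Pooled over the dataset, so large objects weigh more.
        "overall_iou": total_inter / total_union if total_union else 1.0,
        # Every sample weighs the same, regardless of object size.
        "mean_iou": sum(ious) / len(ious),
    }
    for k in thresholds:
        results[f"precision@{k}"] = sum(iou > k for iou in ious) / len(ious)
    return results


if __name__ == "__main__":
    preds = [[[1, 1], [0, 0]], [[1, 0], [0, 0]]]
    targets = [[[1, 1], [1, 0]], [[0, 0], [0, 1]]]
    print(evaluate(preds, targets))
```

Because Overall IoU pools pixels across the dataset while Mean IoU averages per sample, the two can differ noticeably when object sizes vary, which may explain why a model ranks differently on each metric for the same dataset.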

RefVOS: A Closer Look at Referring Expressions for Video Object Segmentation

1 Oct 2020 · Miriam Bellver, Carles Ventura, Carina Silberer, Ioannis Kazakos, Jordi Torres, Xavier Giro-i-Nieto

Video object segmentation with referring expressions (language-guided VOS) takes a linguistic phrase and a video as input and generates binary masks for the object to which the phrase refers. Our work argues that existing benchmarks for this task consist mainly of trivial cases, in which the referent can be identified with a simple phrase. Our analysis relies on a new categorization of the phrases in the DAVIS-2017 and Actor-Action datasets into trivial and non-trivial referring expressions (REs), with the non-trivial REs annotated with seven RE semantic categories. We leverage this data to analyze the results of RefVOS, a novel neural network that obtains competitive results for language-guided image segmentation and state-of-the-art results for language-guided VOS. Our study indicates that the major challenges for the task are related to understanding motion and static actions.

Code

miriambellver/refvos (official)

imatge-upc/refvos

Tasks

Image Segmentation

Referring Expression Segmentation

Segmentation

Video Object Segmentation

Datasets

Introduced in the Paper:

A2Dre

A2Dre+

Used in the Paper:

MS COCO

DAVIS

CLEVR

RefCOCO

DAVIS 2017

Referring Expressions for DAVIS 2016 & 2017

A2D Sentences

Results from the Paper

Ranked #1 on Referring Expression Segmentation on A2Dre test

Methods

1x1 Convolution • Adam • Attention Dropout • BERT • Convolution • Dense Connections • Dilated Convolution • Dropout • GELU • Grouped Convolution • Layer Normalization • Linear Layer • Linear Warmup With Linear Decay • Multi-Head Attention • Multiscale Dilated Convolution Block • Residual Connection • Scaled Dot-Product Attention • Softmax • VOS • Weight Decay • WordPiece
