TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Referring Expression Comprehension	Talk2Car	Stacked VLBert	AP50	71.0	# 3
Referring Expression Comprehension	Talk2Car	CMRT	AP50	69.1	# 4
Referring Expression Comprehension	Talk2Car	ASSMR	AP50	66.0	# 7


Commands 4 Autonomous Vehicles (C4AV) Workshop Summary

18 Sep 2020 · Thierry Deruyttere, Simon Vandenhende, Dusan Grujicic, Yu Liu, Luc van Gool, Matthew Blaschko, Tinne Tuytelaars, Marie-Francine Moens

The task of visual grounding requires locating the most relevant region or object in an image, given a natural language query. So far, progress on this task has mostly been measured on curated datasets, which are not always representative of human spoken language. In this work, we deviate from recent, popular task settings and consider the problem under an autonomous vehicle scenario. In particular, we consider a situation where passengers can give a vehicle free-form natural language commands, each of which can be associated with an object in the street scene. To stimulate research on this topic, we have organized the Commands for Autonomous Vehicles (C4AV) challenge based on the recent Talk2Car dataset (URL: https://www.aicrowd.com/challenges/eccv-2020-commands-4-autonomous-vehicles). This paper presents the results of the challenge. First, we compare the benchmark used against existing datasets for visual grounding. Second, we identify the aspects that make top-performing models successful and relate them to existing state-of-the-art models for visual grounding; we also detect potential failure cases by evaluating on carefully selected subsets. Finally, we discuss several possibilities for future work.



Tasks


Autonomous Vehicles

Referring Expression Comprehension

Visual Grounding

Datasets

MS COCO

Visual Question Answering

nuScenes

RefCOCO

Talk2Car

Results from the Paper


Ranked #3 on Referring Expression Comprehension on Talk2Car
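
The AP50 values reported for Talk2Car can be illustrated with a minimal sketch. It assumes that each command has exactly one predicted box and one ground-truth box, and that a prediction counts as correct when its intersection-over-union (IoU) with the ground truth exceeds 0.5. The `(x1, y1, x2, y2)` box format and the names `iou` and `ap50` are hypothetical choices for this example, not taken from the challenge code.

```python
from typing import Sequence, Tuple

# Assumed box format: (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def ap50(preds: Sequence[Box], gts: Sequence[Box], threshold: float = 0.5) -> float:
    """Percentage of predictions whose IoU with the ground truth exceeds the threshold."""
    if len(preds) != len(gts):
        raise ValueError("preds and gts must have the same length")
    if not preds:
        return 0.0
    hits = sum(iou(p, g) > threshold for p, g in zip(preds, gts))
    return 100.0 * hits / len(preds)


if __name__ == "__main__":
    predictions = [(0.0, 0.0, 2.0, 2.0), (0.0, 0.0, 2.0, 2.0)]
    ground_truth = [(0.0, 0.0, 2.0, 2.0), (1.0, 1.0, 3.0, 3.0)]
    print(ap50(predictions, ground_truth))
```

In the usage example, the first prediction matches its ground truth exactly and the second overlaps it with an IoU of 1/7, so only one of the two commands counts as correctly grounded.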


