TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Vision-Language Navigation	Room2Room	R2R+EnvDrop	spl	0.61	# 1
Vision and Language Navigation	VLN Challenge	null	success	0.69	# 23
Vision and Language Navigation	VLN Challenge	null	length	686.82	# 7
Vision and Language Navigation	VLN Challenge	null	error	3.26	# 132
Vision and Language Navigation	VLN Challenge	null	oracle success	0.99	# 2
Vision and Language Navigation	VLN Challenge	null	spl	0.01	# 134
Vision and Language Navigation	VLN Challenge	Back Translation with Environmental Dropout (no beam search)	success	0.51	# 110
Vision and Language Navigation	VLN Challenge	Back Translation with Environmental Dropout (no beam search)	length	11.66	# 102
Vision and Language Navigation	VLN Challenge	Back Translation with Environmental Dropout (no beam search)	error	5.23	# 45
Vision and Language Navigation	VLN Challenge	Back Translation with Environmental Dropout (no beam search)	oracle success	0.59	# 111
Vision and Language Navigation	VLN Challenge	Back Translation with Environmental Dropout (no beam search)	spl	0.47	# 89
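The `spl` column is Success weighted by Path Length. A minimal sketch of how it is commonly computed, assuming the leaderboard follows the standard definition (per-episode success multiplied by the ratio of shortest-path length to the larger of the agent's path length and the shortest-path length, averaged over episodes). The function and argument names are illustrative and are not taken from the leaderboard code.

```python
from typing import Sequence


def spl(successes: Sequence[bool],
        shortest_lengths: Sequence[float],
        path_lengths: Sequence[float]) -> float:
    """Success weighted by Path Length, averaged over episodes.

    Assumed definition: mean over episodes of S_i * l_i / max(p_i, l_i),
    where S_i is success, l_i the shortest-path length, and p_i the
    length of the path the agent actually took.
    """
    if not (len(successes) == len(shortest_lengths) == len(path_lengths)):
        raise ValueError("all inputs must have the same number of episodes")
    if len(successes) == 0:
        raise ValueError("at least one episode is required")

    total = 0.0
    for success, shortest, taken in zip(successes, shortest_lengths, path_lengths):
        denom = max(taken, shortest)
        # Failed episodes contribute zero; guard against zero-length episodes.
        if success and denom > 0:
            total += shortest / denom
    return total / len(successes)


# One optimal success, one success with a path twice as long, one failure.
score = spl([True, True, False], [10.0, 10.0, 10.0], [10.0, 20.0, 10.0])
print(score)  # 0.5
```

Under this definition a successful episode whose trajectory is far longer than the shortest path contributes almost nothing, which is consistent with a row that pairs a high success rate and a very large length with a near-zero spl.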

Badge	Markdown
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-to-navigate-unseen-environments-back/vision-language-navigation-on-room2room)](https://paperswithcode.com/sota/vision-language-navigation-on-room2room?p=learning-to-navigate-unseen-environments-back)`
	`[![PWC](https://img.shields.io/endpoint.svg?url=https://paperswithcode.com/badge/learning-to-navigate-unseen-environments-back/vision-and-language-navigation-on-vln)](https://paperswithcode.com/sota/vision-and-language-navigation-on-vln?p=learning-to-navigate-unseen-environments-back)`

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout

NAACL 2019 · Hao Tan, Licheng Yu, Mohit Bansal

A grand goal in AI is to build a robot that can accurately navigate based on natural language instructions, which requires the agent to perceive the scene, understand and ground language, and act in the real-world environment. One key challenge here is to learn to navigate in new environments that are unseen during training. Most of the existing approaches perform dramatically worse in unseen environments than in seen ones. In this paper, we present a generalizable navigational agent. Our agent is trained in two stages. The first stage is training via mixed imitation and reinforcement learning, combining the benefits from both off-policy and on-policy optimization. The second stage is fine-tuning via newly introduced 'unseen' triplets (environment, path, instruction). To generate these unseen triplets, we propose a simple but effective 'environmental dropout' method to mimic unseen environments, which overcomes the problem of limited seen environment variability. Next, we apply semi-supervised learning (via back-translation) on these dropped-out environments to generate new paths and instructions. Empirically, we show that our agent generalizes substantially better when fine-tuned with these triplets, outperforming the state-of-the-art approaches by a large margin on the private unseen test set of the Room-to-Room task, and achieving the top rank on the leaderboard.
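The abstract does not spell out how environmental dropout is applied. The sketch below assumes one reading: a single feature-channel mask is sampled per environment and reused for every view in that environment, so the same channels vanish everywhere and the result looks like a consistent "new" environment rather than per-view noise. The function names, the list-of-lists feature layout, and the inverted-dropout rescaling are all assumptions made for illustration, not the authors' implementation.

```python
import random
from typing import List


def sample_env_mask(feat_dim: int, drop_prob: float, rng: random.Random) -> List[float]:
    """Sample one channel mask for a whole environment (hypothetical helper).

    Dropped channels get 0.0; kept channels are rescaled by 1 / (1 - p),
    the usual inverted-dropout convention (an assumption here).
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    scale = 1.0 / (1.0 - drop_prob)
    return [0.0 if rng.random() < drop_prob else scale for _ in range(feat_dim)]


def apply_env_mask(view_features: List[List[float]], mask: List[float]) -> List[List[float]]:
    """Apply the same mask to every view's feature vector in one environment."""
    for row in view_features:
        if len(row) != len(mask):
            raise ValueError("feature vector and mask lengths differ")
    return [[x * m for x, m in zip(row, mask)] for row in view_features]


# Three views of one environment, each with an 8-dimensional feature vector.
rng = random.Random(0)
views = [[1.0] * 8 for _ in range(3)]
mask = sample_env_mask(8, 0.5, rng)
dropped = apply_env_mask(views, mask)

# Every view loses exactly the same channels.
print(dropped[0] == dropped[1] == dropped[2])
```

Ordinary feature dropout would instead resample the mask for each view, which perturbs the features without changing the environment in any consistent way.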


Code


airsplay/R2R-EnvDrop official

120

Tasks


Navigate

Translation

Vision-Language Navigation

Datasets

Matterport3D

R2R

Results from the Paper


Ranked #1 on Vision-Language Navigation on Room2Room

