Vision and Language Navigation
88 papers with code • 5 benchmarks • 13 datasets
Libraries
Use these libraries to find Vision and Language Navigation models and implementations

Most implemented papers
Speaker-Follower Models for Vision-and-Language Navigation
We use this speaker model to (1) synthesize new instructions for data augmentation and to (2) implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction.
The Regretful Navigation Agent for Vision-and-Language Navigation
As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.
Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation
We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al.
REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments
One of the long-term challenges of robotics is to enable robots to interact with humans in the visual world via natural language, as humans are visual animals that communicate through language.
Chasing Ghosts: Instruction Following as Bayesian State Tracking
Our experiments show that our approach outperforms a strong LingUNet baseline when predicting the goal location on the map.
Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
In Vision-and-Language Navigation (VLN), an embodied agent needs to reach a target destination with the only guidance of a natural language instruction.
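The task setup above can be illustrated with a toy episode loop. Everything here (the corridor environment, the trivial policy, and the `navigate` helper) is a hypothetical sketch for intuition only, not any listed paper's implementation:

```python
# Toy sketch of a VLN episode: an agent receives an instruction and
# acts until it issues "stop". The 1-D corridor world and the trivial
# policy below are hypothetical placeholders.

ACTIONS = ["forward", "turn_left", "turn_right", "stop"]

def navigate(instruction, start, goal, max_steps=10):
    """Follow a toy 1-D corridor: move forward until reaching the goal index."""
    pos, trajectory = start, []
    for _ in range(max_steps):
        # A real agent would condition on the instruction and visual
        # observations; this stand-in policy just compares positions.
        action = "forward" if pos < goal else "stop"
        trajectory.append(action)
        if action == "stop":
            break
        pos += 1
    return pos, trajectory

pos, traj = navigate("walk to the end of the corridor", start=0, goal=3)
```

In real VLN benchmarks such as R2R, the environment is a graph of panoramic viewpoints and success is judged by stopping within a fixed distance of the goal, but the episode structure (instruction in, action sequence out, explicit stop action) is the same.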
Robust Navigation with Language Pretraining and Stochastic Sampling
Core to the vision-and-language navigation (VLN) challenge is building robust instruction representations and action decoding schemes, which can generalize well to previously unseen instructions and environments.
Multimodal Attention Networks for Low-Level Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) is a challenging task in which an agent needs to follow a language-specified path to reach a target destination.
VALAN: Vision and Language Agent Navigation
VALAN is a lightweight and scalable software framework for deep reinforcement learning based on the SEED RL architecture.
Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training
By training on a large amount of image-text-action triplets in a self-supervised learning manner, the pre-trained model provides generic representations of visual environments and language instructions.