Vision and Language Navigation

43 papers with code • 4 benchmarks • 13 datasets

Vision-and-Language Navigation (VLN) requires an agent to follow natural-language navigation instructions by perceiving and moving through a visual environment, typically a photo-realistic indoor or street-level scene.

Most implemented papers

Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

peteanderson80/Matterport3DSimulator CVPR 2018

This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering.

Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments

lil-lab/touchdown CVPR 2019

We study the problem of jointly reasoning about language and vision through a navigation and spatial reasoning task.

Retouchdown: Adding Touchdown to StreetLearn as a Shareable Resource for Language Grounding Tasks in Street View

google-research/valan 10 Jan 2020

These have been added to the StreetLearn dataset and can be obtained via the same process as used previously for StreetLearn.

The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

chihyaoma/regretful-agent CVPR 2019

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

Beyond the Nav-Graph: Vision-and-Language Navigation in Continuous Environments

jacobkrantz/VLN-CE ECCV 2020

We develop a language-guided navigation task set in a continuous 3D environment where agents must execute low-level actions to follow natural language navigation directions.

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

chihyaoma/selfmonitoring-agent ICLR 2019

The Vision-and-Language Navigation (VLN) task entails an agent following navigation instructions in photo-realistic, previously unseen environments.

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

peteanderson80/Matterport3DSimulator ECCV 2018

In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices: we propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task.

Speaker-Follower Models for Vision-and-Language Navigation

ronghanghu/speaker_follower NeurIPS 2018

We use this speaker model to (1) synthesize new instructions for data augmentation and to (2) implement pragmatic reasoning, which evaluates how well candidate action sequences explain an instruction.
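The pragmatic-reasoning step described above can be sketched as reranking: a follower proposes candidate action sequences, and a speaker model scores how well each candidate explains the instruction, i.e. an argmax over P(instruction | route). The sketch below is a toy illustration, not the actual speaker_follower implementation; `speaker_score` is a hypothetical stand-in for a trained seq2seq speaker model.

```python
def speaker_score(instruction_words, route):
    """Hypothetical scorer: fraction of instruction words 'explained'
    by landmarks observed along the route. A real speaker model would
    instead compute log P(instruction | route) with a seq2seq decoder."""
    landmarks = {lm for step in route for lm in step["landmarks"]}
    hits = sum(1 for w in instruction_words if w in landmarks)
    return hits / max(len(instruction_words), 1)

def pragmatic_rerank(instruction, candidate_routes):
    """Pick the candidate route whose observations best explain the
    instruction (pragmatic inference over the follower's candidates)."""
    words = instruction.lower().split()
    return max(candidate_routes, key=lambda r: speaker_score(words, r))

# Usage: two candidate routes, e.g. from a follower's beam search.
routes = [
    [{"landmarks": {"door"}}, {"landmarks": {"stairs"}}],
    [{"landmarks": {"kitchen"}}, {"landmarks": {"table"}}],
]
best = pragmatic_rerank("go past the kitchen to the table", routes)
```

The same speaker model can also generate instructions for unannotated routes, which is how the paper's data-augmentation step works.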
