About

Vision-language navigation (VLN) is the task of guiding an embodied agent through real 3D environments by following natural language instructions.

(Image credit: Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout)

Greatest papers with code

Look Before You Leap: Bridging Model-Free and Model-Based Reinforcement Learning for Planned-Ahead Vision-and-Language Navigation

ECCV 2018 peteanderson80/Matterport3DSimulator

In this paper, we take a radical approach to bridge the gap between synthetic studies and real-world practices: we propose a novel, planned-ahead hybrid reinforcement learning model that combines model-free and model-based reinforcement learning to solve a real-world vision-language navigation task (sketched below).

MODEL-BASED REINFORCEMENT LEARNING ROBOT NAVIGATION VISION AND LANGUAGE NAVIGATION VISION-LANGUAGE NAVIGATION
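
A minimal sketch (not the paper's implementation) of the planned-ahead idea described above: a model-free policy scores candidate actions, a learned environment model rolls each candidate forward a few steps, and the agent commits to the action whose simulated future looks best. The callables `policy_score`, `env_model`, and `value_fn` are hypothetical stand-ins for learned modules, and a shared discrete action set is assumed for simplicity.

```python
import math

def plan_ahead_action(state, candidate_actions, policy_score, env_model, value_fn,
                      horizon=3, mix=0.5):
    """Pick an action by mixing model-free policy scores with model-based lookahead.

    policy_score(state, action) -> float   # model-free preference (hypothetical)
    env_model(state, action)    -> state   # learned one-step dynamics (hypothetical)
    value_fn(state)             -> float   # learned value / progress estimate (hypothetical)
    """
    best_action, best_score = None, -math.inf
    for action in candidate_actions:
        # Model-based part: roll the learned environment model forward a few steps,
        # greedily following the model-free policy inside the imagined rollout.
        sim_state = env_model(state, action)
        for _ in range(horizon - 1):
            next_action = max(candidate_actions, key=lambda a: policy_score(sim_state, a))
            sim_state = env_model(sim_state, next_action)
        lookahead_value = value_fn(sim_state)

        # Mix the model-free preference with the simulated outcome.
        score = (1 - mix) * policy_score(state, action) + mix * lookahead_value
        if score > best_score:
            best_action, best_score = action, score
    return best_action
```

The `mix` weight trades off trusting the reactive policy against trusting the learned dynamics model, which is the hybrid aspect the abstract refers to.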

The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

CVPR 2019 chihyaoma/regretful-agent

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

DECISION MAKING VISION AND LANGUAGE NAVIGATION VISION-LANGUAGE NAVIGATION VISUAL NAVIGATION

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout

NAACL 2019 airsplay/R2R-EnvDrop

We then apply semi-supervised learning (via back-translation) on these dropped-out environments to generate new paths and instructions (sketched below).

VISION-LANGUAGE NAVIGATION
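
A minimal sketch, under my own assumptions, of the two ideas named above: environmental dropout (dropping feature channels consistently across all views of an environment to imitate unseen environments) and back-translation (a trained speaker model labels sampled paths in the augmented environments with generated instructions, which then train the instruction follower). `sample_path`, `speaker`, and `follower_update` are hypothetical callables, not the authors' API.

```python
import numpy as np

def environmental_dropout(view_features, drop_rate=0.4, rng=None):
    """Drop whole feature channels with one mask shared across all views of an
    environment, which mimics removing objects/textures and yields a 'new'
    environment (a sketch, not the authors' exact code)."""
    rng = rng or np.random.default_rng()
    num_feats = view_features.shape[-1]
    mask = (rng.random(num_feats) > drop_rate).astype(view_features.dtype) / (1.0 - drop_rate)
    return view_features * mask  # same channel mask applied to every view

def back_translate(envs, sample_path, speaker, follower_update):
    """Semi-supervised loop: sample paths in dropped-out environments, let a speaker
    model generate instructions for them, and train the follower on the pairs."""
    for env in envs:                                         # env: {view_id: feature array}
        aug_env = {view: environmental_dropout(feats) for view, feats in env.items()}
        path = sample_path(aug_env)                          # e.g. a sampled trajectory
        instruction = speaker(aug_env, path)                 # back-translation step
        follower_update(aug_env, instruction, path)          # supervised update on generated pair
```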

Environment-agnostic Multitask Learning for Natural Language Grounded Navigation

ECCV 2020 google-research/valan

Recent research efforts enable the study of natural language grounded navigation in photo-realistic environments, e.g., following natural language instructions or dialog.

VISION-LANGUAGE NAVIGATION

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

CVPR 2019 Kelym/FAST

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al. (a sketch of the search loop follows below).

VISION AND LANGUAGE NAVIGATION VISION-LANGUAGE NAVIGATION
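
A toy sketch of the frontier-search-with-backtracking idea, as I read the acronym (not the released FAST code): candidate trajectories are kept in a global priority queue, so the decoder can jump back to an earlier, better-scoring frontier node instead of extending the current path greedily. `neighbors`, `score_fn`, and `is_goal` are hypothetical stand-ins for the environment graph and the learned scoring/stopping modules.

```python
import heapq
import itertools

def fast_decode(start_node, neighbors, score_fn, is_goal, max_steps=40):
    """Frontier-aware action decoding with backtracking, sketched."""
    tie = itertools.count()                               # tie-breaker for equal scores
    start_path = [start_node]
    frontier = [(-score_fn(start_path), next(tie), start_path)]
    best_path, best_score = start_path, score_fn(start_path)
    expanded = set()

    for _ in range(max_steps):
        if not frontier:
            break
        neg_score, _, path = heapq.heappop(frontier)      # best candidate anywhere
        if -neg_score > best_score:                       # remember the best path seen so far
            best_path, best_score = path, -neg_score
        if is_goal(path):
            return path
        node = path[-1]
        if node in expanded:
            continue
        expanded.add(node)
        for nxt in neighbors(node):                       # push all extensions onto the frontier
            if nxt not in expanded:
                new_path = path + [nxt]
                heapq.heappush(frontier, (-score_fn(new_path), next(tie), new_path))
    return best_path
```

Because the pop always takes the globally best candidate, switching to a path that branched off several steps ago is what implements the backtracking behavior.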

Active Visual Information Gathering for Vision-Language Navigation

ECCV 2020 HanqingWangAI/Active_VLN

Vision-language navigation (VLN) is the task of requiring an agent to carry out navigational instructions inside photo-realistic environments.

VISION-LANGUAGE NAVIGATION

Structured Scene Memory for Vision-Language Navigation

5 Mar 2021 HanqingWangAI/SSM-VLN

Recently, numerous algorithms have been developed to tackle the problem of vision-language navigation (VLN), i.e., requiring an agent to navigate 3D environments by following linguistic instructions (a minimal memory-structure sketch follows below).

DECISION MAKING VISION-LANGUAGE NAVIGATION
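
The snippet above only states the task, so as a loose illustration of what a graph-structured scene memory generally stores (my assumption, not the paper's design), here is a minimal container that persists per-viewpoint observations and connectivity as the agent explores and exposes the unexplored frontier for planning.

```python
class SceneMemory:
    """A minimal graph-structured scene memory sketch: keep every observed
    viewpoint and its connectivity, so decisions can consider the whole
    explored map rather than only the current view. Field names are
    illustrative, not the paper's."""

    def __init__(self):
        self.nodes = {}    # viewpoint_id -> observation summary / feature vector
        self.edges = {}    # viewpoint_id -> set of navigable neighbor ids

    def observe(self, viewpoint_id, features, navigable_ids):
        """Record an observation and add undirected navigability edges."""
        self.nodes[viewpoint_id] = features
        self.edges.setdefault(viewpoint_id, set()).update(navigable_ids)
        for nid in navigable_ids:
            self.edges.setdefault(nid, set()).add(viewpoint_id)

    def frontier(self):
        """Viewpoints that are reachable but not yet observed -- candidate next goals."""
        seen = set(self.nodes)
        return {nid for nbrs in self.edges.values() for nid in nbrs} - seen
```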

Cross-Lingual Vision-Language Navigation

24 Oct 2019 zzxslp/Crosslingual-VLN

Commanding a robot to navigate with natural language instructions is a long-term goal for grounded language understanding and robotics.

DOMAIN ADAPTATION VISION-LANGUAGE NAVIGATION ZERO-SHOT LEARNING