Vision-Language Navigation

7 papers with code · Computer Vision

Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments.

Greatest papers with code

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

CVPR 2019 extreme-assistant/cvpr2019

Vision-language navigation (VLN) is the task of navigating an embodied agent to carry out natural language instructions inside real 3D environments.

IMITATION LEARNING VISION-LANGUAGE NAVIGATION

The Regretful Navigation Agent for Vision-and-Language Navigation

CVPR 2019 (Oral) chihyaoma/regretful-agent

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

DECISION MAKING VISION-LANGUAGE NAVIGATION VISUAL NAVIGATION

The Regretful Agent: Heuristic-Aided Navigation through Progress Estimation

CVPR 2019 chihyaoma/regretful-agent

As deep learning continues to make progress for challenging perception tasks, there is increased interest in combining vision, language, and decision-making.

DECISION MAKING VISION-LANGUAGE NAVIGATION VISUAL NAVIGATION

Self-Monitoring Navigation Agent via Auxiliary Progress Estimation

ICLR 2019 chihyaoma/selfmonitoring-agent

The Vision-and-Language Navigation (VLN) task entails an agent following navigational instructions in photo-realistic unknown environments.

NATURAL LANGUAGE VISUAL GROUNDING VISION-LANGUAGE NAVIGATION VISUAL NAVIGATION
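The auxiliary progress estimation above can be illustrated as a regression target trained jointly with action prediction: the monitor predicts how far along the path the agent is, and its error is added to the navigation loss. A minimal sketch with hypothetical names (not the authors' implementation):

```python
def progress_labels(path_len: int) -> list[float]:
    """Normalized progress toward the goal at each step of a ground-truth
    path: 0.0 at the start, 1.0 at the goal."""
    return [t / (path_len - 1) for t in range(path_len)]


def joint_loss(action_loss: float,
               predicted_progress: list[float],
               true_progress: list[float],
               lam: float = 0.5) -> float:
    """Combine the main action-prediction loss with a mean-squared-error
    auxiliary loss on the agent's progress estimates (weight `lam` is an
    illustrative hyperparameter)."""
    aux = sum((p - q) ** 2
              for p, q in zip(predicted_progress, true_progress))
    aux /= len(true_progress)
    return action_loss + lam * aux
```

In this framing, a perfect monitor adds nothing to the loss, while confidently wrong progress estimates are penalized alongside wrong actions.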

Learning to Navigate Unseen Environments: Back Translation with Environmental Dropout

NAACL 2019 airsplay/R2R-EnvDrop

Next, we apply semi-supervised learning (via back-translation) on these dropped-out environments to generate new paths and instructions.

VISION-LANGUAGE NAVIGATION

Tactical Rewind: Self-Correction via Backtracking in Vision-and-Language Navigation

CVPR 2019 Kelym/FAST

We present the Frontier Aware Search with backTracking (FAST) Navigator, a general framework for action decoding that achieves state-of-the-art results on the Room-to-Room (R2R) Vision-and-Language Navigation challenge of Anderson et al.

VISION-LANGUAGE NAVIGATION
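The search-with-backtracking mechanism can be sketched as best-first search over partial paths: a global frontier of candidate continuations is kept, the highest-scoring one is always expanded next, and the navigator implicitly backtracks whenever an earlier branch outscores the current one. The scoring function below is a placeholder; FAST itself combines model log-probabilities with progress estimates, which are abstracted here:

```python
import heapq


def fast_decode(neighbors, score, start, is_goal, max_expansions=100):
    """Best-first decoding over partial paths (sketch of the FAST idea,
    not the paper's exact scoring).

    neighbors(node) -> iterable of successor nodes
    score(path)     -> higher is better (placeholder for the model's score)
    """
    # Max-heap via negated scores; each entry is (-score, path).
    frontier = [(-score([start]), [start])]
    visited = set()
    for _ in range(max_expansions):
        if not frontier:
            break
        _, path = heapq.heappop(frontier)
        node = path[-1]
        if is_goal(node):
            return path
        if node in visited:
            continue  # a better partial path already expanded this node
        visited.add(node)
        for nxt in neighbors(node):
            if nxt not in visited:
                new_path = path + [nxt]
                heapq.heappush(frontier, (-score(new_path), new_path))
    return None  # search budget exhausted
```

Because every pushed partial path stays on the frontier, "backtracking" costs nothing extra: popping an older, shorter path simply resumes the search from that earlier node.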

Cross-Lingual Vision-Language Navigation

24 Oct 2019 zzxslp/Crosslingual-VLN

In addition, we introduce an adversarial domain adaptation loss to improve the transfer ability of our model when given a certain amount of target-language data.

DOMAIN ADAPTATION VISION-LANGUAGE NAVIGATION ZERO-SHOT LEARNING