Visual Navigation
101 papers with code • 6 benchmarks • 16 datasets
Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions, based on the camera observations only.
Source: Vision-based Navigation Using Deep Reinforcement Learning
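The task formulation above (agent receives a target image, then repeatedly maps camera observations to discrete actions until it stops at the goal) can be sketched as a minimal loop. Everything here is a hypothetical illustration, not any specific library's API: the `GridEnv` class stands in for a real visual environment (its "images" are just position tuples), and `greedy_policy` stands in for a learned model.

```python
# Hypothetical sketch of the image-goal navigation loop described above.
# A real system would use rendered camera frames and a learned policy;
# here positions stand in for images so the loop is runnable end to end.

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

class GridEnv:
    """Toy stand-in for a visual environment on a 2-D grid."""
    def __init__(self, start=(0, 0), goal=(3, 0)):
        self.pos = list(start)
        self.heading = 0           # 0=+x, 1=+y, 2=-x, 3=-y
        self.goal = goal

    def observation(self):
        return tuple(self.pos)     # placeholder for a camera frame

    def target_image(self):
        return self.goal           # placeholder for the goal-view image

    def step(self, action):
        if action == "move_forward":
            dx, dy = [(1, 0), (0, 1), (-1, 0), (0, -1)][self.heading]
            self.pos[0] += dx
            self.pos[1] += dy
        elif action == "turn_left":
            self.heading = (self.heading + 1) % 4
        elif action == "turn_right":
            self.heading = (self.heading - 1) % 4
        return self.observation()

def greedy_policy(obs, target):
    """Hypothetical policy: in practice a learned model maps
    (observation, target image) -> action."""
    if obs == target:
        return "stop"
    return "move_forward" if obs[0] < target[0] else "turn_left"

def navigate(env, max_steps=20):
    """Run the observe -> act loop until the agent stops or times out."""
    target = env.target_image()
    for _ in range(max_steps):
        action = greedy_policy(env.observation(), target)
        if action == "stop":
            return True
        env.step(action)
    return False
```

The key structural point is that the policy conditions only on the current observation and the target image; no map or ground-truth pose is available to the agent.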
Libraries
Use these libraries to find Visual Navigation models and implementations.
Latest papers
TTA-Nav: Test-time Adaptive Reconstruction for Point-Goal Navigation under Visual Corruptions
Our "plug-and-play" method incorporates a top-down decoder to a pre-trained navigation model.
MemoNav: Working Memory Model for Visual Navigation
Subsequently, a graph attention module encodes the retained STM and the LTM to generate working memory (WM) which contains the scene features essential for efficient navigation.
Towards Learning a Generalist Model for Embodied Navigation
We conduct extensive experiments to evaluate the performance and generalizability of our model.
What you see is what you get: Experience ranking with deep neural dataset-to-dataset similarity for topological localisation
In the case of localisation, important dataset differences impacting performance are modes of appearance change, including weather, lighting, and season.
Zero-Shot Object Goal Visual Navigation With Class-Independent Relationship Network
This method combines object detection information with the relative semantic similarity between detected objects and the navigation target, and constructs a new state representation based on similarity ranking. Because this representation includes neither target features nor environment features, it effectively decouples the agent's navigation ability from the features of any particular target.
CaMP: Causal Multi-policy Planning for Interactive Navigation in Multi-room Scenes
Visual navigation has been widely studied under the assumption that there may be several clear routes to reach the goal.
VLN-PETL: Parameter-Efficient Transfer Learning for Vision-and-Language Navigation
Vision-and-Language Navigation (VLN) tasks have recently seen rapid performance gains thanks to the use of large pre-trained vision-and-language models.
Language-enhanced RNR-Map: Querying Renderable Neural Radiance Field maps with natural language
We present Le-RNR-Map, a Language-enhanced Renderable Neural Radiance map for Visual Navigation with natural language query prompts.
Scaling Data Generation in Vision-and-Language Navigation
Recent research in language-guided visual navigation has demonstrated a significant demand for the diversity of traversable environments and the quantity of supervision for training generalizable agents.