Visual Navigation is the problem of navigating an agent, e.g. a mobile robot, in an environment using camera input only. The agent is given a target image (an image it will see from the target position), and its goal is to move from its current position to the target by applying a sequence of actions based solely on its camera observations.
(Source: Vision-based Navigation Using Deep Reinforcement Learning)
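To make the problem setup above concrete, here is a minimal sketch of the image-goal navigation loop. The `NavEnv` environment and the `policy` function are hypothetical placeholders, not any particular framework's API.

```python
# Minimal sketch of the image-goal navigation loop described above.
# `NavEnv` and `policy` are hypothetical stand-ins, not a real API.
import numpy as np

ACTIONS = ["move_forward", "turn_left", "turn_right", "stop"]

class NavEnv:
    """Toy placeholder environment: returns random RGB frames."""
    def reset(self):
        return np.random.rand(128, 128, 3)          # current observation
    def step(self, action):
        return np.random.rand(128, 128, 3), False   # next obs, done flag

def policy(obs, goal_image):
    """Stand-in policy: a learned network would map (obs, goal) -> action."""
    return np.random.choice(ACTIONS)

env = NavEnv()
goal_image = np.random.rand(128, 128, 3)  # image seen from the target pose
obs, done = env.reset(), False
for _ in range(100):                       # episode horizon
    action = policy(obs, goal_image)
    if action == "stop" or done:
        break
    obs, done = env.step(action)
```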
We propose using high-level semantic and contextual features, including segmentation and detection masks obtained from off-the-shelf state-of-the-art vision models, as observations, and train a deep network to learn the navigation policy.
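A hedged sketch of this idea, assuming PyTorch: per-class segmentation masks and a detection mask are stacked as observation channels and fed to a small convolutional policy. `run_segmentation` and `run_detector` are placeholders for off-the-shelf vision models; the actual architecture may differ.

```python
# Sketch: semantic/detection masks as policy observations (assumed design).
import torch
import torch.nn as nn

NUM_CLASSES, NUM_ACTIONS = 16, 4

def run_segmentation(rgb):   # placeholder for an off-the-shelf segmenter
    return torch.rand(NUM_CLASSES, 128, 128)   # per-class soft masks

def run_detector(rgb):       # placeholder detection-mask channel
    return torch.rand(1, 128, 128)

class MaskPolicy(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(NUM_CLASSES + 1, 32, 5, stride=4), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2), nn.ReLU(),
            nn.Flatten(),
            nn.LazyLinear(NUM_ACTIONS),   # action logits
        )
    def forward(self, masks):
        return self.net(masks)

rgb = torch.rand(3, 128, 128)
obs = torch.cat([run_segmentation(rgb), run_detector(rgb)], dim=0)
logits = MaskPolicy()(obs.unsqueeze(0))   # shape [1, NUM_ACTIONS]
```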
The accumulated belief of the world enables the agent to track visited regions of the environment.
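One simple way to realize such an accumulated belief is a top-down visitation map updated with the agent's pose at every step. In the minimal sketch below, the grid size, pose format, and decay factor are illustrative assumptions.

```python
# Sketch: accumulating a visitation map so the agent can distinguish
# visited from unvisited regions. Grid size and pose format are assumed.
import numpy as np

belief = np.zeros((100, 100), dtype=np.float32)  # top-down map of visits

def update_belief(belief, pose, decay=0.99):
    """Mark the agent's current cell as visited; decay stale evidence."""
    belief *= decay
    x, y = pose
    belief[y, x] = 1.0
    return belief

belief = update_belief(belief, pose=(50, 50))
unvisited_frontier = belief < 0.05   # boolean mask of unexplored cells
```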
To minimize the number of cameras needed for surround perception, we utilize fisheye cameras.
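A back-of-the-envelope calculation shows why a wider field of view reduces the camera count needed for 360° coverage; the FOV and overlap values below are illustrative, not taken from the paper.

```python
# Illustrative arithmetic: cameras needed for 360-degree horizontal
# coverage at a given horizontal FOV, with some overlap for stitching.
import math

def cameras_for_surround(hfov_deg, overlap_deg=10.0):
    usable = hfov_deg - overlap_deg
    return math.ceil(360.0 / usable)

print(cameras_for_surround(70.0))    # typical pinhole lens -> 6 cameras
print(cameras_for_surround(190.0))   # fisheye lens         -> 2 cameras
```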
We also investigate the sim2real predictivity of Habitat-Sim for PointGoal navigation.
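A hedged sketch of one way to quantify sim2real predictivity: correlate per-model performance in simulation against performance on a real robot (the Habitat sim2real study reports a Pearson-style Sim2Real Correlation Coefficient in this spirit). The numbers below are made-up placeholders.

```python
# Sketch: does simulation performance predict real-robot performance?
# All success rates here are fabricated placeholders for illustration.
import numpy as np

sim_success  = np.array([0.91, 0.77, 0.62, 0.85])  # per-model, in sim
real_success = np.array([0.80, 0.70, 0.55, 0.74])  # same models, real robot

srcc = np.corrcoef(sim_success, real_success)[0, 1]  # Pearson correlation
print(f"sim2real correlation: {srcc:.2f}")  # near 1.0 => sim predicts real
```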
Self-supervised learning aims to learn representations from the data itself without explicit manual supervision.
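As a concrete example, a classic self-supervised pretext task is rotation prediction, where the label is generated from the data itself. This is only one illustration; the papers tagged here may use different pretext tasks.

```python
# Sketch of a self-supervised pretext task (rotation prediction):
# labels come from the data itself, with no manual annotation.
import torch
import torch.nn as nn

def make_rotation_batch(images):
    """Rotate each image by k*90 degrees; the rotation index is the label."""
    ks = torch.randint(0, 4, (images.size(0),))
    rotated = torch.stack([torch.rot90(img, int(k), dims=(1, 2))
                           for img, k in zip(images, ks)])
    return rotated, ks

encoder = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                        nn.Flatten(), nn.LazyLinear(4))  # 4 rotation classes

images = torch.rand(8, 3, 32, 32)           # unlabeled images
x, y = make_rotation_batch(images)
loss = nn.functional.cross_entropy(encoder(x), y)
loss.backward()                              # representation learned for free
```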
Tasks: Object Detection, Representation Learning, Self-Supervised Learning, Visual Navigation
Nano-size unmanned aerial vehicles (UAVs), a few centimeters in diameter and with a sub-10 W total power budget, have so far been considered incapable of running sophisticated vision-based autonomous navigation software without external aid from base stations, ad-hoc local positioning infrastructure, and powerful external computation servers.
As part of our general methodology, we discuss the software mapping techniques that enable the state-of-the-art deep convolutional neural network presented in [1] to be executed fully on-board within a strict 6 fps real-time constraint, with no compromise in flight results, while all processing consumes only 64 mW on average.
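A quick sanity check of the quoted budget: at 6 fps and 64 mW average power, each inference has roughly a 167 ms time budget and consumes about 10.7 mJ.

```python
# Per-frame time and energy budget implied by the figures quoted above.
fps = 6
power_mw = 64

frame_budget_ms = 1000.0 / fps          # ~166.7 ms to run the full CNN
energy_per_frame_mj = power_mw / fps    # mW / fps = mJ per frame, ~10.7 mJ
print(f"{frame_budget_ms:.1f} ms and {energy_per_frame_mj:.1f} mJ per frame")
```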
Learning a new task often requires both exploring to gather task-relevant information and exploiting this information to solve the task.
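A toy illustration of this trade-off is an epsilon-greedy bandit, which spends a fraction of its steps exploring and the rest exploiting its current estimates. This generic sketch is not the method of any particular paper listed here.

```python
# Epsilon-greedy bandit: explore to gather information, exploit to use it.
import numpy as np

rng = np.random.default_rng(0)
true_means = np.array([0.2, 0.5, 0.8])        # unknown to the agent
counts, values = np.zeros(3), np.zeros(3)

for t in range(1000):
    if rng.random() < 0.1:                     # explore: random arm
        arm = int(rng.integers(3))
    else:                                      # exploit: best estimate
        arm = int(np.argmax(values))
    reward = rng.normal(true_means[arm], 0.1)
    counts[arm] += 1
    values[arm] += (reward - values[arm]) / counts[arm]  # running mean

print(np.argmax(values))   # should settle on arm 2, the best one
```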
This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering.
Ranked #3 on Visual Navigation on R2R
Tasks: Vision and Language Navigation, Visual Navigation, Visual Question Answering
In this paper we study the problem of learning to learn at both training and test time in the context of visual navigation.
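A hedged sketch of what "learning to learn" at test time can look like, in the MAML style: adapt a fast copy of the policy weights with a few gradient steps on a self-supervised loss during the episode, then act with the adapted weights. All names and details below are illustrative assumptions, not the paper's exact method.

```python
# Sketch: MAML-style test-time adaptation on a self-supervised loss.
import torch
import torch.nn as nn

policy = nn.Linear(10, 4)               # tiny stand-in policy network
inner_lr = 0.1

def self_supervised_loss(logits):
    """Placeholder for an interaction loss computable without rewards."""
    return logits.pow(2).mean()

obs = torch.rand(1, 10)
# Inner-loop adaptation (test time): update a fast copy of the weights.
loss = self_supervised_loss(policy(obs))
grads = torch.autograd.grad(loss, list(policy.parameters()))
fast_weights = [p - inner_lr * g for p, g in zip(policy.parameters(), grads)]

# Act with the adapted parameters instead of the originals.
adapted_logits = obs @ fast_weights[0].t() + fast_weights[1]
action = adapted_logits.argmax(dim=-1)
```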