Towards Cognitive Exploration through Deep Reinforcement Learning for Mobile Robots

6 Oct 2016  ·  Lei Tai, Ming Liu ·

Exploration in an unknown environment is the core functionality for mobile robots. Learning-based exploration methods, including convolutional neural networks, provide excellent strategies without human-designed logic for the feature extraction. But the conventional supervised learning algorithms cost lots of efforts on the labeling work of datasets inevitably. Scenes not included in the training set are mostly unrecognized either. We propose a deep reinforcement learning method for the exploration of mobile robots in an indoor environment with the depth information from an RGB-D sensor only. Based on the Deep Q-Network framework, the raw depth image is taken as the only input to estimate the Q values corresponding to all moving commands. The training of the network weights is end-to-end. In arbitrarily constructed simulation environments, we show that the robot can be quickly adapted to unfamiliar scenes without any man-made labeling. Besides, through analysis of receptive fields of feature representations, deep reinforcement learning motivates the convolutional networks to estimate the traversability of the scenes. The test results are compared with the exploration strategies separately based on deep learning or reinforcement learning. Even trained only in the simulated environment, experimental results in real-world environment demonstrate that the cognitive ability of robot controller is dramatically improved compared with the supervised method. We believe it is the first time that raw sensor information is used to build cognitive exploration strategy for mobile robots through end-to-end deep reinforcement learning.

PDF Abstract

Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here