no code implementations • NeurIPS 2019 • Liangpeng Zhang, Ke Tang, Xin Yao
We argue that explicit planning for exploration can help alleviate such a problem, and propose a Value Iteration for Exploration Cost (VIEC) algorithm which computes the optimal exploration scheme by solving an augmented MDP.
no code implementations • NeurIPS 2017 • Liangpeng Zhang, Ke Tang, Xin Yao
Underestimation and overestimation of state/action values are harmful to reinforcement learning agents.
no code implementations • 2 Dec 2016 • Liangpeng Zhang, Ke Tang, Xin Yao
We then provide empirical results to verify our approach, and demonstrate how the success probability of exploration can be used to analyse and predict the behaviours and possible outcomes of exploration, which are key to answering the important questions about exploration.
1 code implementation • 25 Jan 2016 • Guiying Li, Junlong Liu, Chunhui Jiang, Liangpeng Zhang, Minlong Lin, Ke Tang
R-CNN-style methods are among the state-of-the-art object detection methods, and consist of region proposal generation followed by deep CNN classification.