Shrinkage-based Bias-Variance Trade-off for Deep Reinforcement Learning

27 Sep 2018  ·  Yihao Feng, Hao Liu, Jian Peng, Qiang Liu

Deep reinforcement learning has achieved remarkable success in solving a range of challenging artificial intelligence tasks. A variety of algorithms have been introduced and improved towards human-level performance. Although technical advances have been made for each individual algorithm, there is strong evidence that further substantial improvements can be achieved by properly combining multiple approaches with different biases and variances. In this work, we propose to use the James-Stein (JS) shrinkage estimator to combine on-policy policy gradient estimators, which have low bias but high variance, with low-variance, high-bias gradient estimates such as those constructed from model-based methods or from temporally smoothed averages of historical gradients. Empirical results show that our simple shrinkage approach is very effective in practice and substantially improves the sample efficiency of state-of-the-art on-policy methods on various continuous control tasks.
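
To make the idea of shrinkage concrete, the sketch below combines two gradient estimates with the textbook positive-part James-Stein estimator. It is only an illustration under stated assumptions: the abstract does not give the paper's exact estimator, so the function name `james_stein_combine`, the scalar per-coordinate variance `sigma2`, and the choice of the biased estimate as the shrinkage target are all hypothetical.

```python
import numpy as np


def james_stein_combine(g_unbiased, g_biased, sigma2):
    """Shrink a noisy, low-bias gradient toward a low-variance, biased one.

    Textbook positive-part James-Stein form (an assumption, not necessarily
    the paper's estimator):
        g = g_biased + max(0, 1 - (d - 2) * sigma2 / ||g_unbiased - g_biased||^2)
                       * (g_unbiased - g_biased)
    where d is the number of parameters and sigma2 is an assumed
    per-coordinate variance of g_unbiased.
    """
    g_unbiased = np.asarray(g_unbiased, dtype=float)
    g_biased = np.asarray(g_biased, dtype=float)
    d = g_unbiased.size
    diff = g_unbiased - g_biased
    sq_dist = float(np.dot(diff.ravel(), diff.ravel()))

    # The classic James-Stein result needs d >= 3; also guard against
    # division by zero when the two estimates coincide.
    if d < 3 or sq_dist == 0.0:
        return g_unbiased.copy()

    # Positive-part clipping keeps the weight in [0, 1].
    weight = max(0.0, 1.0 - (d - 2) * sigma2 / sq_dist)
    return g_biased + weight * diff
```

In this sketch, a large `sigma2` (a very noisy on-policy estimate) drives the weight to zero so the result falls back to the biased estimate, while `sigma2 = 0` returns the on-policy estimate unchanged. In practice `sigma2` would itself have to be estimated, for example from the spread of per-sample gradients.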

