A K-fold Method for Baseline Estimation in Policy Gradient Algorithms

3 Jan 2017 · Nithyanand Kota, Abhishek Mishra, Sunil Srinivasa, Xi Chen, Pieter Abbeel

Unbiased policy-gradient methods such as VPG and REINFORCE suffer from high variance, which is typically mitigated by adding a baseline. However, fitting the baseline can itself suffer from underfitting or overfitting...
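The abstract is cut off before it describes the method, so the sketch below is only an illustration of the general idea named in the title, not the paper's procedure. It assumes the baseline is a simple per-timestep mean of returns, and that each trajectory's baseline is fit only on the folds that do not contain it. The function name `kfold_advantages` and its parameters are hypothetical.

```python
import random


def kfold_advantages(returns, k, seed=0):
    """Illustrative K-fold baseline (an assumption, not the paper's algorithm).

    returns: list of trajectories, each a list of per-timestep returns,
             all of the same length.
    k:       number of folds, with 2 <= k <= number of trajectories.

    For each fold, a time-dependent mean baseline is fit on the other
    k - 1 folds and subtracted from the held-out fold's returns.
    """
    n = len(returns)
    if not 2 <= k <= n:
        raise ValueError("k must satisfy 2 <= k <= number of trajectories")
    horizon = len(returns[0])

    # Shuffle trajectory indices, then deal them into k folds.
    order = list(range(n))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]

    advantages = [None] * n
    for held_out in folds:
        held = set(held_out)
        train = [i for i in order if i not in held]
        # Baseline at timestep t: mean return over the training folds only.
        baseline = [
            sum(returns[i][t] for i in train) / len(train)
            for t in range(horizon)
        ]
        for i in held_out:
            advantages[i] = [returns[i][t] - baseline[t] for t in range(horizon)]
    return advantages


if __name__ == "__main__":
    toy_returns = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0], [7.0, 8.0]]
    # k equal to the number of trajectories gives a leave-one-out baseline.
    print(kfold_advantages(toy_returns, k=4))
```

In this sketch a small `k` fits each baseline on less data, while a large `k` approaches leave-one-out fitting. How the paper itself chooses or uses K is not stated in the text above.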
