An Improved Convergence Analysis of Stochastic Variance-Reduced Policy Gradient

29 May 2019 · Pan Xu, Felicia Gao, Quanquan Gu

We revisit the stochastic variance-reduced policy gradient (SVRPG) method proposed by Papini et al. (2018) for reinforcement learning. We provide an improved convergence analysis of SVRPG and show that it can find an $\epsilon$-approximate stationary point of the performance function within $O(1/\epsilon^{5/3})$ trajectories...
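
For readers unfamiliar with the criterion, here is a hedged sketch of what "$\epsilon$-approximate stationary point" commonly means in this line of work. The abstract as shown here does not define it, so the notation is an assumption: $J(\theta)$ stands for the performance function and $\theta$ for the policy parameters. A point $\theta$ is usually called $\epsilon$-approximate stationary when

$$\mathbb{E}\big[\|\nabla J(\theta)\|_2^2\big] \le \epsilon .$$

Under that convention, the $O(1/\epsilon^{5/3})$ bound counts the trajectories sampled before the method reaches such a point.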

