Reconciling $\lambda$-Returns with Experience Replay

23 Oct 2018 · Brett Daley, Christopher Amato

Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the $\lambda$-return difficult in this context. In particular, off-policy methods that utilize experience replay remain problematic because their random sampling of minibatches is not conducive to the efficient calculation of $\lambda$-returns...
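
For context, the $\lambda$-return mentioned in the abstract is commonly computed with a backward recursion over a contiguous trajectory, $G_t^\lambda = r_t + \gamma\,[(1-\lambda)\,V(s_{t+1}) + \lambda\,G_{t+1}^\lambda]$, which is why isolated, randomly sampled transitions make it awkward to evaluate. The sketch below illustrates that textbook recursion only; it is not the authors' method. The function name `lambda_returns`, its parameters, and the default values of `gamma` and `lam` are assumptions chosen for illustration, and `next_values[t]` is assumed to hold a value estimate for the state reached after step `t` (zero if that state is terminal).

```python
def lambda_returns(rewards, next_values, gamma=0.99, lam=0.9):
    """Compute lambda-returns for one contiguous trajectory.

    rewards[t]     : reward received at step t
    next_values[t] : value estimate of the state reached after step t
                     (assumed to be 0.0 if that state is terminal)
    """
    if len(rewards) != len(next_values):
        raise ValueError("rewards and next_values must have equal length")
    if not rewards:
        return []

    returns = [0.0] * len(rewards)
    # Beyond the last stored step, fall back on the value estimate (bootstrap).
    next_return = next_values[-1]
    # Sweep backward so each step can reuse the return of its successor.
    for t in reversed(range(len(rewards))):
        blended = (1.0 - lam) * next_values[t] + lam * next_return
        returns[t] = rewards[t] + gamma * blended
        next_return = returns[t]
    return returns
```

With `lam=0.0` the recursion reduces to one-step bootstrapped targets, and with `lam=1.0` it reduces to the discounted sum of rewards bootstrapped only at the end of the trajectory. Because each step depends on its successor, the whole sweep needs the transitions in temporal order.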
