Multi-step Greedy Policies in Model-Free Deep Reinforcement Learning

ICLR 2020 Anonymous

Multi-step greedy policies have been extensively used in model-based Reinforcement Learning (RL) and in the case when a model of the environment is available (e.g., in the game of Go). In this work, we explore the benefits of multi-step greedy policies in model-free RL when employed in the framework of multi-step Dynamic Programming (DP): multi-step Policy and Value Iteration...
