Deep Reinforcement Learning with Double Q-learning

22 Sep 2015 · Hado van Hasselt, Arthur Guez, David Silver

The popular Q-learning algorithm is known to overestimate action values under certain conditions. It was not previously known whether, in practice, such overestimations are common, whether they harm performance, and whether they can generally be prevented...
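The overestimation mentioned above can be illustrated with a small simulation. This is a minimal sketch and not the paper's experiment. It assumes every action has a true value of 0 and that each estimate carries independent Gaussian noise. All names here (`noisy_estimates`, `single_estimate`, `double_estimate`, `compare`) are hypothetical and chosen for illustration. Taking the maximum over one set of noisy estimates is biased upward, while selecting the action with one set and evaluating it with a second, independent set (the idea behind a double estimator) is not biased upward in this setting.

```python
import random
import statistics


def noisy_estimates(true_values, noise_std, rng):
    """Return one noisy estimate per action (assumed Gaussian noise)."""
    return [v + rng.gauss(0.0, noise_std) for v in true_values]


def single_estimate(q):
    """Use the same estimates to select and to evaluate the best action."""
    return max(q)


def double_estimate(q_a, q_b):
    """Select the action with q_a, then evaluate it with independent q_b."""
    best_action = max(range(len(q_a)), key=q_a.__getitem__)
    return q_b[best_action]


def compare(num_actions=10, noise_std=1.0, trials=20000, seed=0):
    """Average both estimators over many trials; the true maximum is 0."""
    rng = random.Random(seed)
    true_values = [0.0] * num_actions  # assumption: all actions equally good
    single, double = [], []
    for _ in range(trials):
        q_a = noisy_estimates(true_values, noise_std, rng)
        q_b = noisy_estimates(true_values, noise_std, rng)
        single.append(single_estimate(q_a))
        double.append(double_estimate(q_a, q_b))
    return statistics.fmean(single), statistics.fmean(double)


single_bias, double_bias = compare()
print(f"single estimator mean: {single_bias:.3f}")
print(f"double estimator mean: {double_bias:.3f}")
```

Under these assumptions the single-estimator average should sit well above the true maximum of 0, and the double-estimator average should stay close to 0. How large either effect is in a real learning agent depends on the task and the function approximator, which this toy setup does not model.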

