Qualitative Measurements of Policy Discrepancy for Return-Based Deep Q-Network

14 Jun 2018 · Wenjia Meng, Qian Zheng, Long Yang, Pengfei Li, Gang Pan

The deep Q-network (DQN) and return-based reinforcement learning are two promising algorithms proposed in recent years. DQN brings advances to complex sequential decision problems, while return-based algorithms have advantages in making use of sample trajectories...
