In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.
In this paper, we build on recent work advocating a distributional approach to reinforcement learning, in which the full distribution over returns is modeled explicitly rather than only its mean.
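To make the contrast between modeling the return distribution and estimating only its mean concrete, here is a minimal sketch in plain Python. It fits two quantile estimates by stochastic subgradient descent on the quantile (pinball) loss. The bimodal returns, the quantile levels, the learning rate, and the function names are all illustrative assumptions, not taken from any of the papers listed here.

```python
import random

def pinball_loss_grad(theta, sample, tau):
    """Subgradient of the quantile (pinball) loss with respect to theta.

    The loss is tau * (sample - theta) when sample >= theta,
    and (1 - tau) * (theta - sample) otherwise.
    """
    return -tau if sample >= theta else (1.0 - tau)

def fit_quantiles(returns, taus, lr=0.01, epochs=200, seed=0):
    """Fit one scalar estimate per quantile level tau (hypothetical helper)."""
    rng = random.Random(seed)
    thetas = [0.0 for _ in taus]
    for _ in range(epochs):
        # Visit the observed returns in a fresh random order each epoch.
        for g in rng.sample(returns, len(returns)):
            for i, tau in enumerate(taus):
                thetas[i] -= lr * pinball_loss_grad(thetas[i], g, tau)
    return thetas

# Assumed toy data: the return is 0 or 10 with equal probability.
returns = [0.0] * 50 + [10.0] * 50
taus = [0.25, 0.75]

quantiles = fit_quantiles(returns, taus)
mean_return = sum(returns) / len(returns)

print("mean:", mean_return)      # a value that is never actually observed
print("quantiles:", quantiles)   # settle near the two modes of the data
```

On this toy data the mean summarizes the returns with a value that never occurs, while the two quantile estimates settle close to the two outcomes that do occur.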
In this paper, we propose the Quantile Option Architecture (QUOTA) for exploration based on recent advances in distributional reinforcement learning (RL).
Our method combines elements from distributional reinforcement learning and approximate Bayesian inference techniques with neural networks, allowing us to disentangle both types of uncertainty (aleatoric and epistemic) on the expected return of a policy.
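One common way to separate uncertainty caused by environment randomness (aleatoric) from uncertainty caused by limited knowledge of the model (epistemic) is the law of total variance applied to an ensemble of return distributions. The sketch below illustrates that generic decomposition only. It is not necessarily the estimator used in the paper, and the ensemble representation (each member as a list of equally weighted quantile estimates) and the function name are assumptions.

```python
from statistics import mean, pvariance

def decompose_uncertainty(ensemble_quantiles):
    """Split total return variance into aleatoric and epistemic parts.

    ensemble_quantiles: list of members, each a list of equally weighted
    quantile estimates of the return (hypothetical representation).
    """
    member_means = [mean(q) for q in ensemble_quantiles]
    member_vars = [pvariance(q) for q in ensemble_quantiles]
    aleatoric = mean(member_vars)        # average spread within each member
    epistemic = pvariance(member_means)  # disagreement between members
    return aleatoric, epistemic

# Members agree on a noisy return: all uncertainty is aleatoric.
noisy = decompose_uncertainty([[0.0, 10.0], [0.0, 10.0]])

# Members disagree about a deterministic return: all uncertainty is epistemic.
unsure = decompose_uncertainty([[4.0, 4.0], [6.0, 6.0]])

print(noisy, unsure)
```

The two toy ensembles are chosen so that each one isolates a single source of variance, which makes the roles of the two terms easy to see.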