The Atari 2600 Games task (and its accompanying dataset) involves training an agent to achieve high scores on Atari 2600 games.

Efficient exploration in complex environments remains a major challenge for reinforcement learning.

In this work, we build on recent advances in distributional reinforcement learning to give a generally applicable, flexible, and state-of-the-art distributional variant of DQN.

Experience replay lets online reinforcement learning agents remember and reuse experiences from the past.
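As an illustration of the idea, a minimal sketch of a uniform replay buffer might look like the following. The class name `ReplayBuffer`, the `Transition` fields, and the fixed-capacity eviction policy are assumptions made for this example and are not taken from any particular paper listed here.

```python
import random
from collections import deque, namedtuple

# Hypothetical transition record; the field names are illustrative assumptions.
Transition = namedtuple(
    "Transition", ["state", "action", "reward", "next_state", "done"]
)


class ReplayBuffer:
    """Fixed-capacity store of past transitions with uniform sampling."""

    def __init__(self, capacity, seed=None):
        # A deque with maxlen silently evicts the oldest entry when full.
        self._storage = deque(maxlen=capacity)
        self._rng = random.Random(seed)

    def add(self, state, action, reward, next_state, done):
        """Remember one step of interaction with the environment."""
        self._storage.append(Transition(state, action, reward, next_state, done))

    def sample(self, batch_size):
        """Reuse past experience: draw a batch uniformly, without replacement."""
        return self._rng.sample(list(self._storage), batch_size)

    def __len__(self):
        return len(self._storage)
```

In a training loop, an agent would typically call `add` after every environment step and call `sample` once the buffer holds at least `batch_size` transitions. Prioritized variants replace the uniform draw with a weighted one.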

We propose a conceptually simple and lightweight framework for deep reinforcement learning that uses asynchronous gradient descent for optimization of deep neural network controllers.

In recent years there have been many successes in using deep representations in reinforcement learning.

We present the first deep learning model to successfully learn control policies directly from high-dimensional sensory input using reinforcement learning.

Extending the idea of a locally consistent operator, we then derive sufficient conditions for an operator to preserve optimality, leading to a family of operators which includes our consistent Bellman operator.

Recently, researchers have made significant progress combining the advances in deep learning for learning feature representations with reinforcement learning.

In addition, our platform is flexible in terms of environment-agent communication topologies, choices of RL methods, and changes in game parameters, and it can host existing C/C++-based game environments such as the Arcade Learning Environment.

We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning.
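To make the notion of a value distribution concrete, the sketch below assumes a categorical distribution over a small fixed set of return values ("atoms"). The atom values, the two example actions, and the helper names are hypothetical and chosen only for illustration. It shows two actions whose expected returns are identical while their return distributions differ, which is information a scalar value estimate discards.

```python
def expected_value(probs, atoms):
    """Mean return of a categorical distribution over fixed return atoms."""
    return sum(p * z for p, z in zip(probs, atoms))


def variance(probs, atoms):
    """Spread of the return distribution around its mean."""
    mean = expected_value(probs, atoms)
    return sum(p * (z - mean) ** 2 for p, z in zip(probs, atoms))


# Hypothetical support of five possible returns (an assumption for this example).
atoms = [-10.0, -5.0, 0.0, 5.0, 10.0]

# Action A always returns 0; action B returns -10 or +10 with equal probability.
safe_action = [0.0, 0.0, 1.0, 0.0, 0.0]
risky_action = [0.5, 0.0, 0.0, 0.0, 0.5]

# Both actions have an expected return of 0.0 ...
q_safe = expected_value(safe_action, atoms)
q_risky = expected_value(risky_action, atoms)

# ... but the variances differ: 0.0 for the safe action, 100.0 for the risky one.
var_safe = variance(safe_action, atoms)
var_risky = variance(risky_action, atoms)
```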

