A Distributional Perspective on Reinforcement Learning

ICML 2017 · Marc G. Bellemare, Will Dabney, Rémi Munos

In this paper we argue for the fundamental importance of the value distribution: the distribution of the random return received by a reinforcement learning agent. This is in contrast to the common approach to reinforcement learning which models the expectation of this return, or value. Although there is an established body of literature studying the value distribution, thus far it has always been used for a specific purpose such as implementing risk-aware behaviour. We begin with theoretical results in both the policy evaluation and control settings, exposing a significant distributional instability in the latter. We then use the distributional perspective to design a new algorithm which applies Bellman's equation to the learning of approximate value distributions. We evaluate our algorithm using the suite of games from the Arcade Learning Environment. We obtain both state-of-the-art results and anecdotal evidence demonstrating the importance of the value distribution in approximate reinforcement learning. Finally, we combine theoretical and empirical evidence to highlight the ways in which the value distribution impacts learning in the approximate setting.
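
The abstract does not spell out how Bellman's equation is applied to an approximate value distribution. The sketch below illustrates one possible reading, assuming the distribution is categorical over a fixed grid of 51 atoms (suggested by the "C51" model name in the results). The function name `categorical_projection`, its parameters, and the support bounds `v_min` and `v_max` are hypothetical choices for illustration, not taken from this page.

```python
import numpy as np

def categorical_projection(next_probs, reward, discount, v_min=-10.0, v_max=10.0):
    """Project the distribution of reward + discount * Z onto a fixed support.

    next_probs: 1-D array of probabilities over evenly spaced atoms in [v_min, v_max].
    All names and default bounds here are illustrative assumptions.
    """
    n_atoms = next_probs.shape[0]
    support = np.linspace(v_min, v_max, n_atoms)
    delta_z = (v_max - v_min) / (n_atoms - 1)

    # Shift and scale every atom by the sample Bellman update, then clip to the grid range.
    tz = np.clip(reward + discount * support, v_min, v_max)

    # Fractional grid position of each updated atom (clipped to guard against rounding).
    b = np.clip((tz - v_min) / delta_z, 0, n_atoms - 1)
    lower = np.floor(b).astype(int)
    upper = np.ceil(b).astype(int)

    # Split each atom's mass between its two neighbours in proportion to proximity.
    # If b lands exactly on a grid point, all of the mass goes to that point.
    on_grid = lower == upper
    projected = np.zeros(n_atoms)
    np.add.at(projected, lower, next_probs * np.where(on_grid, 1.0, upper - b))
    np.add.at(projected, upper, next_probs * (b - lower))
    return projected

# Usage: a uniform next-state distribution over 51 atoms.
next_probs = np.full(51, 1.0 / 51)
target = categorical_projection(next_probs, reward=0.5, discount=0.9)
```

The projected vector remains a probability distribution on the original atoms, so it can serve as the target in a cross-entropy loss against the predicted distribution. When no clipping occurs, the linear split preserves the mean of the shifted distribution.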


Results from the Paper


| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Atari Games | Atari 2600 Alien | C51 noop | Score | 3166.0 | #26 |
| Atari Games | Atari 2600 Amidar | C51 noop | Score | 1735.0 | #14 |
| Atari Games | Atari 2600 Assault | C51 noop | Score | 7203.0 | #23 |
| Atari Games | Atari 2600 Asterix | C51 noop | Score | 406211 | #9 |
| Atari Games | Atari 2600 Asteroids | C51 noop | Score | 1516.0 | #33 |
| Atari Games | Atari 2600 Atlantis | C51 noop | Score | 841075.0 | #22 |
| Atari Games | Atari 2600 Bank Heist | C51 noop | Score | 976.0 | #26 |
| Atari Games | Atari 2600 Battle Zone | C51 noop | Score | 28742.0 | #28 |
| Atari Games | Atari 2600 Beam Rider | C51 noop | Score | 14074.0 | #26 |
| Atari Games | Atari 2600 Berzerk | C51 noop | Score | 1645.0 | #19 |
| Atari Games | Atari 2600 Bowling | C51 noop | Score | 81.8 | #15 |
| Atari Games | Atari 2600 Boxing | C51 noop | Score | 97.8 | #22 |
| Atari Games | Atari 2600 Breakout | C51 noop | Score | 748.0 | #14 |
| Atari Games | Atari 2600 Centipede | C51 noop | Score | 9646.0 | #19 |
| Atari Games | Atari 2600 Chopper Command | C51 noop | Score | 15600.0 | #14 |
| Atari Games | Atari 2600 Crazy Climber | C51 noop | Score | 179877.0 | #10 |
| Atari Games | Atari 2600 Demon Attack | C51 noop | Score | 130955.0 | #11 |
| Atari Games | Atari 2600 Double Dunk | C51 noop | Score | 2.5 | #18 |
| Atari Games | Atari 2600 Enduro | C51 noop | Score | 3454.0 | #4 |
| Atari Games | Atari 2600 Fishing Derby | C51 noop | Score | 8.9 | #33 |
| Atari Games | Atari 2600 Freeway | C51 noop | Score | 33.9 | #10 |
| Atari Games | Atari 2600 Frostbite | C51 noop | Score | 3965.0 | #21 |
| Atari Games | Atari 2600 Gopher | C51 noop | Score | 33641.0 | #19 |
| Atari Games | Atari 2600 Gravitar | C51 noop | Score | 440.0 | #35 |
| Atari Games | Atari 2600 HERO | C51 noop | Score | 38874 | #4 |
| Atari Games | Atari 2600 Ice Hockey | C51 noop | Score | -3.5 | #20 |
| Atari Games | Atari 2600 James Bond | C51 noop | Score | 1909.0 | #18 |
| Atari Games | Atari 2600 Kangaroo | C51 noop | Score | 12853.0 | #19 |
| Atari Games | Atari 2600 Krull | C51 noop | Score | 9735.0 | #20 |
| Atari Games | Atari 2600 Kung-Fu Master | C51 noop | Score | 48192.0 | #18 |
| Atari Games | Atari 2600 Ms. Pacman | C51 noop | Score | 3415.0 | #25 |
| Atari Games | Atari 2600 Name This Game | C51 noop | Score | 12542.0 | #20 |
| Atari Games | Atari 2600 Pong | C51 noop | Score | 20.9 | #17 |
| Atari Games | Atari 2600 Private Eye | C51 noop | Score | 15095.0 | #8 |
| Atari Games | Atari 2600 Q*Bert | C51 noop | Score | 23784 | #18 |
| Atari Games | Atari 2600 River Raid | C51 noop | Score | 17322.0 | #15 |
| Atari Games | Atari 2600 Road Runner | C51 noop | Score | 55839.0 | #22 |
| Atari Games | Atari 2600 Robotank | C51 noop | Score | 52.3 | #28 |
| Atari Games | Atari 2600 Seaquest | C51 noop | Score | 266434.0 | #10 |
| Atari Games | Atari 2600 Space Invaders | C51 noop | Score | 5747.0 | #21 |
| Atari Games | Atari 2600 Star Gunner | C51 noop | Score | 49095.0 | #33 |
| Atari Games | Atari 2600 Tennis | C51 noop | Score | 23.1 | #8 |
| Atari Games | Atari 2600 Time Pilot | C51 noop | Score | 8329.0 | #28 |
| Atari Games | Atari 2600 Tutankham | C51 noop | Score | 280.0 | #12 |
| Atari Games | Atari 2600 Up and Down | C51 noop | Score | 15612.0 | #33 |
| Atari Games | Atari 2600 Venture | C51 noop | Score | 1520.0 | #13 |
| Atari Games | Atari 2600 Video Pinball | C51 noop | Score | 949604.0 | #5 |
| Atari Games | Atari 2600 Wizard of Wor | C51 noop | Score | 9300.0 | #21 |
| Atari Games | Atari 2600 Zaxxon | C51 noop | Score | 10513.0 | #24 |
