Value Prediction Network

NeurIPS 2017 · Junhyuk Oh, Satinder Singh, Honglak Lee

This paper proposes a novel deep reinforcement learning (RL) architecture, called Value Prediction Network (VPN), which integrates model-free and model-based RL methods into a single neural network. In contrast to typical model-based RL methods, VPN learns a dynamics model whose abstract states are trained to make option-conditional predictions of future values (discounted sum of rewards) rather than of future observations...
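As a small illustration of the "discounted sum of rewards" that the abstract refers to, the sketch below computes the discounted return of a finite reward sequence. It is not code from the paper: the function name `discounted_return` and the default discount factor `gamma=0.99` are assumptions chosen for this example only.

```python
def discounted_return(rewards, gamma=0.99):
    """Return sum over t of gamma**t * rewards[t] for a finite reward list.

    Illustrative helper, not taken from the VPN paper; gamma=0.99 is an
    assumed default, not a value reported in the source.
    """
    total = 0.0
    # Accumulate from the last reward backwards: G_t = r_t + gamma * G_{t+1}.
    for reward in reversed(rewards):
        total = reward + gamma * total
    return total


if __name__ == "__main__":
    # Three rewards of 1 with gamma = 0.5 give 1 + 0.5 + 0.25.
    print(discounted_return([1.0, 1.0, 1.0], gamma=0.5))
```

The backward loop applies the recursion G_t = r_t + gamma * G_{t+1}, which avoids computing powers of gamma explicitly.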

Results from the Paper


TASK         DATASET                   MODEL  METRIC NAME  METRIC VALUE  GLOBAL RANK
Atari Games  Atari 2600 Alien          VPN    Score        1429          #25
Atari Games  Atari 2600 Amidar         VPN    Score        641           #23
Atari Games  Atari 2600 Crazy Climber  VPN    Score        54119         #32
Atari Games  Atari 2600 Enduro         VPN    Score        382           #26
Atari Games  Atari 2600 Frostbite      VPN    Score        3811          #14
Atari Games  Atari 2600 Krull          VPN    Score        15930         #3
Atari Games  Atari 2600 Ms. Pacman     VPN    Score        2689          #20
Atari Games  Atari 2600 Q*Bert         VPN    Score        14517         #19
Atari Games  Atari 2600 Seaquest       VPN    Score        5628          #21