#Exploration: A Study of Count-Based Exploration for Deep Reinforcement Learning

NeurIPS 2017 · Haoran Tang, Rein Houthooft, Davis Foote, Adam Stooke, Xi Chen, Yan Duan, John Schulman, Filip De Turck, Pieter Abbeel

Count-based exploration algorithms are known to perform near-optimally when used in conjunction with tabular reinforcement learning (RL) methods for solving small discrete Markov decision processes (MDPs). It is generally thought that count-based methods cannot be applied in high-dimensional state spaces, since most states will only occur once...
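To make the contrast in the abstract concrete, the following is a minimal sketch of a tabular visit-count bonus. It is not taken from the paper: the bonus form `beta / sqrt(n(s))`, the value of `beta`, the helper name `count_bonus`, and both toy state spaces are assumptions chosen for illustration. In the small state space the counts grow and the bonus decays; in the high-dimensional one almost every state is seen exactly once, so the counts carry little information.

```python
import math
import random
from collections import Counter


def count_bonus(counts, state, beta=0.1):
    """Record a visit to `state` and return beta / sqrt(n(state)).

    `beta` and the 1/sqrt(n) form are illustrative assumptions.
    """
    counts[state] += 1
    return beta / math.sqrt(counts[state])


rng = random.Random(0)

# Small discrete state space: 5 states, 1000 visits.
# States repeat often, so counts grow and the bonus shrinks.
small = Counter()
small_bonuses = [count_bonus(small, rng.randrange(5)) for _ in range(1000)]

# High-dimensional state space: each state is a tuple of 64 random bits.
# Repeats are vanishingly rare, so nearly every count stays at 1.
large = Counter()
large_bonuses = [
    count_bonus(large, tuple(rng.getrandbits(1) for _ in range(64)))
    for _ in range(1000)
]

print("distinct small states:", len(small), "smallest bonus:", min(small_bonuses))
print("distinct large states:", len(large), "smallest bonus:", min(large_bonuses))
```

In the second case the bonus never decays, so a raw count no longer distinguishes novel states from familiar ones.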


Evaluation results from the paper

| Task | Dataset | Model | Metric name | Metric value | Global rank |
|---|---|---|---|---|---|
| Atari Games | Atari 2600 Freeway | TRPO-hash | Score | 34.0 | # 1 |
| Atari Games | Atari 2600 Frostbite | TRPO-hash | Score | 5214.0 | # 2 |
| Atari Games | Atari 2600 Montezuma's Revenge | TRPO-hash | Score | 75.0 | # 13 |
| Atari Games | Atari 2600 Venture | TRPO-hash | Score | 445.0 | # 9 |