Challenges of Context and Time in Reinforcement Learning: Introducing Space Fortress as a Benchmark

6 Sep 2018 · Akshat Agarwal, Ryan Hope, Katia Sycara

Research in deep reinforcement learning (RL) has coalesced around improving performance on benchmarks like the Arcade Learning Environment. However, these benchmarks conspicuously miss important characteristics, such as abrupt context-dependent shifts in strategy and temporal sensitivity, that are often present in real-world domains. As a result, RL research has not focused on these challenges, producing algorithms that do not understand critical changes in context and have little notion of real-world time. To tackle this issue, this paper introduces the game of Space Fortress as an RL benchmark that incorporates these characteristics. We show that existing state-of-the-art RL algorithms are unable to learn to play the Space Fortress game. We then confirm that this poor performance is due to the RL algorithms' context insensitivity and reward sparsity. We also identify independent axes along which to vary context and temporal sensitivity, allowing Space Fortress to be used as a testbed for understanding both characteristics in combination and in isolation. We release Space Fortress as an open-source Gym environment.


Results from the Paper


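| Task / Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|
| Space Fortress Autoturn | PPO (SF-GRU) | Average Score | 2510 | # 1 |
| Space Fortress Autoturn | PPO (SF-GRU) | Best Score | 2870 | # 1 |
| Space Fortress Autoturn | PPO (SF-GRU) | Fortress Death | 43 | # 1 |
| Space Fortress Autoturn | Rainbow | Average Score | -2973 | # 3 |
| Space Fortress Autoturn | Rainbow | Best Score | -2330 | # 3 |
| Space Fortress Autoturn | Rainbow | Fortress Death | 1.2 | # 4 |
| Space Fortress Autoturn | A2C (SF-GRU) | Average Score | -1641 | # 3 |
| Space Fortress Autoturn | A2C (SF-GRU) | Best Score | -718 | # 3 |
| Space Fortress Autoturn | A2C (SF-GRU) | Fortress Death | 3 | # 3 |
| Space Fortress Autoturn | PPO (SF-FF) | Average Score | 2337 | # 2 |
| Space Fortress Autoturn | PPO (SF-FF) | Best Score | 2818 | # 2 |
| Space Fortress Autoturn | PPO (SF-FF) | Fortress Death | 41 | # 2 |
| Space Fortress Youturn | PPO (SF-GRU) | Average Score | 2356 | # 1 |
| Space Fortress Youturn | PPO (SF-GRU) | Best Score | 2932 | # 1 |
| Space Fortress Youturn | PPO (SF-GRU) | Fortress Death | 41 | # 1 |
| Space Fortress Youturn | Rainbow | Average Score | -4112 | # 3 |
| Space Fortress Youturn | Rainbow | Best Score | -3934 | # 3 |
| Space Fortress Youturn | Rainbow | Fortress Death | 0 | # 4 |
| Space Fortress Youturn | A2C (SF-GRU) | Average Score | -2444 | # 3 |
| Space Fortress Youturn | A2C (SF-GRU) | Best Score | -1700 | # 3 |
| Space Fortress Youturn | A2C (SF-GRU) | Fortress Death | 11 | # 3 |
| Space Fortress Youturn | PPO (SF-FF) | Average Score | 2235 | # 2 |
| Space Fortress Youturn | PPO (SF-FF) | Best Score | 2880 | # 2 |
| Space Fortress Youturn | PPO (SF-FF) | Fortress Death | 40 | # 2 |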
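
To make the comparison between models easier to read, the average scores listed above can be ordered per game variant with a few lines of Python. This is only an illustrative sketch: the dictionary layout and the helper names `rank_models` and `ppo_gru_minus_ff` are hypothetical and not part of the released environment, and the numbers are copied from the listing above rather than recomputed.

```python
# Average scores copied from the results listed above.
# The dictionary layout is an illustrative choice, not the authors' format.
average_score = {
    "Autoturn": {
        "PPO (SF-GRU)": 2510,
        "PPO (SF-FF)": 2337,
        "A2C (SF-GRU)": -1641,
        "Rainbow": -2973,
    },
    "Youturn": {
        "PPO (SF-GRU)": 2356,
        "PPO (SF-FF)": 2235,
        "A2C (SF-GRU)": -2444,
        "Rainbow": -4112,
    },
}


def rank_models(scores):
    """Return (model, score) pairs sorted from highest to lowest score."""
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


def ppo_gru_minus_ff(scores):
    """Average-score difference between the two PPO entries in the listing."""
    return scores["PPO (SF-GRU)"] - scores["PPO (SF-FF)"]


for variant, scores in average_score.items():
    print(f"Space Fortress {variant}")
    for model, score in rank_models(scores):
        print(f"  {model:<14} {score:>6}")
    print(f"  PPO (SF-GRU) minus PPO (SF-FF): {ppo_gru_minus_ff(scores)}")
```

In both variants the two PPO entries are the only ones with a positive average score, and the difference between them is small compared with the distance to A2C and Rainbow.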
