Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms. Solutions typically employ algorithms that incentivise exploration of the environment in different ways.
For the state-of-the-art tables, please consult the parent Atari Games task.
In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.
However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail on these environments.
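For context, epsilon-greedy exploration simply takes a uniformly random action with probability epsilon and the greedy action otherwise; a minimal sketch (function name and Q-value representation are illustrative, not from any specific implementation):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """With probability epsilon pick a uniformly random action,
    otherwise pick the action with the highest Q-value."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

# With epsilon = 0 the choice is purely greedy.
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0)
```

Because such undirected random actions rarely string together into the long, precise sequences that sparse-reward games demand, epsilon-greedy agents almost never stumble onto a reward in Montezuma's Revenge.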
One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.
Our work is a simple extension of the paper "Exploration by Random Network Distillation".
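Random Network Distillation (RND) derives an intrinsic exploration bonus from the error of a trainable predictor network trying to match the features of a fixed, randomly initialised target network: the error is high on novel observations and shrinks on familiar ones. A toy sketch of this idea, assuming single linear layers and made-up dimensions (the paper uses convolutional networks on frames):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical dimensions for illustration only.
OBS_DIM, FEAT_DIM = 8, 4

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))

# Trainable predictor network of the same shape.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))

def intrinsic_reward(obs):
    """Squared prediction error of predictor vs. fixed target:
    large for novel observations, small for well-visited ones."""
    return float(np.mean((obs @ W_pred - obs @ W_target) ** 2))

def train_predictor(obs, lr=0.01):
    """One gradient step on the mean squared prediction error."""
    global W_pred
    err = obs @ W_pred - obs @ W_target           # (FEAT_DIM,)
    W_pred -= lr * 2.0 / FEAT_DIM * np.outer(obs, err)

obs = rng.normal(size=OBS_DIM)
before = intrinsic_reward(obs)
for _ in range(200):                              # revisit the same state
    train_predictor(obs)
after = intrinsic_reward(obs)
```

After repeated visits to the same observation, `after` is far smaller than `before`, so the agent's exploration bonus for that state decays, pushing it toward states it has not yet seen.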
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.