Montezuma's Revenge is an Atari 2600 benchmark game that is known to be difficult for reinforcement learning algorithms. Solutions typically employ algorithms that incentivise environment exploration in different ways.
For the state-of-the-art tables, please consult the parent Atari Games task.
(Image credit: Q-map)
However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail on these environments.
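For readers unfamiliar with the term, epsilon-greedy action selection can be sketched as below. This is a minimal illustration, not code from any of the papers listed here: the function names, the example Q-values, and the 100-step sequence length are assumptions chosen for the example. The second helper gives a rough sense of why undirected random exploration struggles when a reward sits behind a long, specific sequence of actions.

```python
import random


def epsilon_greedy(q_values, epsilon, rng=random):
    """Return a uniformly random action with probability epsilon,
    otherwise the action with the highest estimated value."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Greedy choice; ties resolve to the lowest action index.
    return max(range(len(q_values)), key=lambda a: q_values[a])


def chance_of_sequence(num_actions, length):
    """Probability that uniformly random actions reproduce one
    specific action sequence of the given length."""
    return (1.0 / num_actions) ** length


if __name__ == "__main__":
    # Hypothetical value estimates for three actions.
    print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1))
    # 18 is the size of the full Atari 2600 action set; the
    # 100-step length is an illustrative assumption.
    print(chance_of_sequence(18, 100))
```

Before any reward has been observed, the value estimates carry no signal, so the agent's behaviour is close to uniformly random and the chance of reaching a distant reward shrinks exponentially with the number of steps required.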
Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.
We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations.
One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.