Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms due to its sparse rewards. Solutions typically employ algorithms that incentivise environment exploration in different ways.
For the state-of-the-art tables, please consult the parent Atari Games task.
(Image credit: Q-map)
Model-based reinforcement learning (RL) is appealing because (i) it enables planning and thus more strategic exploration, and (ii) by decoupling dynamics from rewards, it enables fast transfer to new reward functions.
The deep reinforcement learning community has made several independent improvements to the DQN algorithm.
Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.
Ranked #1 on Atari Games on Atari 2600 Pitfall!
However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail in these environments.
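Epsilon-greedy is the simple exploration rule these algorithms rely on: act randomly with probability epsilon, otherwise pick the action with the highest estimated value. A minimal generic sketch (not the implementation from any particular paper) shows why it struggles with sparse rewards, since exploration is undirected:

```python
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """Select an action index from a list of estimated action values.

    With probability `epsilon`, explore by picking a uniformly random
    action; otherwise exploit by picking the greedy (highest-value) one.
    """
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

Because the random actions are uncorrelated across steps, reaching a distant reward in Montezuma's Revenge requires a long, improbable sequence of lucky explorations, which is why purely epsilon-greedy agents tend to score zero.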
We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations.
Ranked #7 on Atari Games on Atari 2600 Montezuma's Revenge
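One common way to turn such uncertainty into directed exploration is an intrinsic bonus that decays as a state is visited more often. The sketch below is a deliberately simplified count-based version (bonus = beta / sqrt(N(s))); the approach above generalizes uncertainty across observations rather than counting discrete states, so treat this only as an illustration of the idea:

```python
from collections import Counter
from math import sqrt

class CountBonus:
    """Intrinsic exploration bonus that shrinks with visit count.

    `beta` scales the bonus; states seen rarely yield large bonuses,
    pushing the agent toward novel parts of the environment.
    """
    def __init__(self, beta=1.0):
        self.counts = Counter()
        self.beta = beta

    def bonus(self, state):
        # Increment the visit count, then return beta / sqrt(N(state)).
        self.counts[state] += 1
        return self.beta / sqrt(self.counts[state])
```

In practice the bonus is added to the environment reward when training the agent, so sparse extrinsic rewards are supplemented by a dense novelty signal.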
One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.