Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms. Successful solutions typically employ algorithms that incentivise exploration of the environment in some way.
For the state-of-the-art tables, please consult the parent Atari Games task.
Our work is a simple extension of the paper "Exploration by Random Network Distillation". In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.
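Random Network Distillation rewards the agent for visiting states its predictor network has not yet learned to model. A minimal NumPy sketch of that intrinsic-reward idea follows; the linear "networks", dimensions, and learning rate here are illustrative assumptions, not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM = 8, 16  # toy sizes, not the paper's

# Fixed, randomly initialised target network (never trained).
W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))

# Predictor network, trained online to match the target's features.
W_pred = np.zeros((OBS_DIM, FEAT_DIM))

def intrinsic_reward(obs):
    """Prediction error between predictor and fixed target features."""
    err = obs @ W_pred - obs @ W_target
    return float(np.mean(err ** 2))

def train_predictor(obs, lr=0.02):
    """One gradient step on the MSE toward the target's features."""
    global W_pred
    err = obs @ W_pred - obs @ W_target      # feature-space error
    W_pred -= lr * np.outer(obs, err)        # gradient of the squared error

# A frequently visited state becomes "familiar": its bonus shrinks.
familiar = rng.normal(size=OBS_DIM)
before = intrinsic_reward(familiar)
for _ in range(300):
    train_predictor(familiar)
after = intrinsic_reward(familiar)

# A state never seen during training keeps a large prediction error,
# so the agent is rewarded for reaching it.
novel = rng.normal(size=OBS_DIM)
print(before, after, intrinsic_reward(novel))
```

The key design choice is that the target network is random and frozen: the prediction error cannot be driven down globally, only on observations the predictor has actually been trained on, so the error acts as a novelty signal.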
However, many state-of-the-art deep reinforcement learning algorithms, which rely on epsilon-greedy exploration, fail in such sparse-reward environments.
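For contrast, epsilon-greedy exploration is undirected: it perturbs the greedy policy with uniformly random actions and never seeks out unfamiliar states, which is why it rarely stumbles onto Montezuma's Revenge's distant rewards. A minimal sketch (the function name and signature are illustrative):

```python
import random

def epsilon_greedy(q_values, epsilon=0.1, rng=random):
    """With probability epsilon take a uniformly random action,
    otherwise take the greedy (highest-value) action.

    The random branch ignores the state entirely, so exploration
    is undirected rather than novelty-seeking."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])

print(epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.0))  # greedy action: 1
```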
One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.
We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations.
Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.