
Montezuma's Revenge

5 papers with code · Playing Games
Subtask of Atari Games

Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms to perform well on, owing to its sparse rewards. Solutions typically employ algorithms that incentivise environment exploration in different ways.
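A minimal sketch of one such exploration incentive, a count-based intrinsic bonus (the class and parameter names here are illustrative and do not come from any specific paper on this page): states visited less often earn a larger bonus, which an agent can add to the environment reward.

```python
import math
from collections import defaultdict

class CountBonus:
    """Toy count-based exploration bonus: rarer states pay more."""

    def __init__(self, scale=1.0):
        self.scale = scale
        self.counts = defaultdict(int)

    def bonus(self, state):
        # `state` must be hashable (e.g. a discretised or hashed observation).
        self.counts[state] += 1
        return self.scale / math.sqrt(self.counts[state])
```

For pixel observations the state would typically be discretised or hashed first; the bonus is then added to the extrinsic reward when updating value estimates.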

For the state-of-the-art tables, please consult the parent Atari Games task.

State-of-the-art leaderboards

No evaluation results yet. Help compare methods by submitting evaluation metrics.

Greatest papers with code

Empowerment-driven Exploration using Mutual Information Estimation

11 Oct 2018 · navneet-nmk/pytorch-rl

However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail on these environments. We demonstrate that an empowerment-driven agent is able to significantly improve the score of a baseline DQN agent on the game of Montezuma's Revenge.

MONTEZUMA'S REVENGE

Go-Explore: a New Approach for Hard-Exploration Problems

30 Jan 2019 · uber-research/go-explore

Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge. On Pitfall, Go-Explore with domain knowledge is the first algorithm to score above zero.

IMITATION LEARNING MONTEZUMA'S REVENGE

Q-map: a Convolutional Approach for Goal-Oriented Reinforcement Learning

ICLR 2019 · fabiopardo/qmap

We show how this network can be efficiently trained with a 3D variant of Q-learning to update the estimates towards all goals at once. While the Q-map agent could be used for a wide range of applications, we propose a novel exploration mechanism in place of epsilon-greedy that relies on goal selection at a desired distance followed by several steps taken towards it, allowing long and coherent exploratory steps in the environment.
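The goal-selection idea can be illustrated on a toy 1-D corridor (a hypothetical stand-in: the function below is not the paper's Q-map network, and all names are my own). Rather than taking one random epsilon-greedy action at a time, the agent picks a goal at a desired distance and commits to several steps toward it, producing a longer, more coherent exploratory trajectory.

```python
import random

def goal_directed_rollout(start=0, distance=10, commit_steps=5, rng=None):
    """Toy sketch: sample a goal `distance` steps away, then commit to
    several consecutive steps toward it instead of acting randomly."""
    rng = rng or random.Random(0)
    pos = start
    visited = [pos]
    goal = pos + rng.choice([-distance, distance])  # goal at the desired distance
    for _ in range(commit_steps):
        pos += 1 if goal > pos else -1              # step toward the goal
        visited.append(pos)
    return visited
```

Each rollout moves the agent a full `commit_steps` away from its start, whereas uncorrelated random actions would mostly cancel out.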

MONTEZUMA'S REVENGE Q-LEARNING SNES GAMES

Playing hard exploration games by watching YouTube

NeurIPS 2018 · MaxSobolMark/HardRLWithYoutube

One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator. With this approach, the agent exceeds human-level performance on Montezuma's Revenge, Pitfall! and Private Eye for the first time, even if the agent is not presented with any environment rewards.
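One generic way to turn a demonstration trajectory into a reward signal can be sketched as follows (a hedged illustration, not the paper's exact method; the class name, threshold, and embedding inputs are all hypothetical): place checkpoints along the demonstration in some embedding space and pay a bonus each time the agent's embedded observation comes close to the next one.

```python
import numpy as np

class CheckpointImitationReward:
    """Toy imitation reward: +1 for reaching successive demo checkpoints."""

    def __init__(self, demo_embeddings, threshold=0.5):
        self.checkpoints = [np.asarray(e, dtype=float) for e in demo_embeddings]
        self.next_idx = 0
        self.threshold = threshold

    def reward(self, obs_embedding):
        if self.next_idx >= len(self.checkpoints):
            return 0.0  # demonstration fully traversed
        dist = np.linalg.norm(
            np.asarray(obs_embedding, dtype=float) - self.checkpoints[self.next_idx]
        )
        if dist < self.threshold:
            self.next_idx += 1  # advance to the next checkpoint
            return 1.0
        return 0.0
```

Such a shaped reward is dense along the demonstrated path even when the environment itself gives no reward at all.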

MONTEZUMA'S REVENGE

Exploration by Random Network Distillation

30 Oct 2018 · DuaneNielsen/rnd

The bonus is the error of a neural network predicting features of the observations given by a fixed, randomly initialized neural network. In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.
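The bonus described above can be sketched in a few lines. This is a toy illustration with linear maps standing in for the paper's deep networks, and plain gradient steps in place of a full optimizer; the class and parameter names are my own. Novel observations give large prediction error (a large bonus), and repeated visits shrink it as the predictor catches up to the fixed random target.

```python
import numpy as np

class RNDBonus:
    """Toy random-network-distillation bonus: prediction error of a trained
    network against a fixed random target network."""

    def __init__(self, obs_dim, feat_dim=8, lr=0.1, seed=0):
        rng = np.random.default_rng(seed)
        self.target = rng.normal(size=(obs_dim, feat_dim))  # fixed, never trained
        self.predictor = np.zeros((obs_dim, feat_dim))      # trained online
        self.lr = lr

    def bonus(self, obs):
        obs = np.asarray(obs, dtype=float)
        target_feat = obs @ self.target
        pred_feat = obs @ self.predictor
        err = pred_feat - target_feat
        # one gradient step on the squared error, moving the predictor
        # toward the target features for this observation
        self.predictor -= self.lr * np.outer(obs, err)
        return float(np.mean(err ** 2))
```

Because the target network is fixed, the bonus is non-stationary only through the predictor's training, which is what makes it usable as an exploration signal.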

MONTEZUMA'S REVENGE