About

Montezuma's Revenge is an Atari 2600 benchmark game that is notoriously difficult for reinforcement learning algorithms because its rewards are sparse. Solutions typically employ algorithms that incentivise environment exploration in different ways.

For the state-of-the-art tables, please consult the parent Atari Games task.

(Image credit: Q-map)

Greatest papers with code

Learning Abstract Models for Strategic Exploration and Fast Reward Transfer

12 Jul 2020 · google-research/google-research

Model-based reinforcement learning (RL) is appealing because (i) it enables planning and thus more strategic exploration, and (ii) by decoupling dynamics from rewards, it enables fast transfer to new reward functions.

MODEL-BASED REINFORCEMENT LEARNING MONTEZUMA'S REVENGE
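
The decoupling the abstract mentions can be illustrated with a toy sketch: a reward-free dynamics model is queried by a short-horizon planner, and transferring to a new task only means swapping in a new reward function while reusing the same dynamics. The grid world, exhaustive planner, and all names below are illustrative assumptions, not the paper's abstract-model construction.

```python
# Illustrative sketch (not the paper's method): planning against a reward-free
# dynamics model, so a new reward function can be swapped in without relearning.
import itertools

ACTIONS = [(0, 1), (0, -1), (1, 0), (-1, 0)]  # right, left, down, up

def dynamics(state, action, size=3):
    """Reward-free transition model on a size x size grid."""
    x, y = state
    dx, dy = action
    return (min(max(x + dx, 0), size - 1), min(max(y + dy, 0), size - 1))

def plan(state, reward_fn, horizon=4):
    """Exhaustive short-horizon planning; only reward_fn changes across tasks."""
    best_value, best_first_action = float("-inf"), None
    for seq in itertools.product(ACTIONS, repeat=horizon):
        s, value = state, 0.0
        for a in seq:
            s = dynamics(s, a)
            value += reward_fn(s)
        if value > best_value:
            best_value, best_first_action = value, seq[0]
    return best_first_action

# Transfer = a new reward function, same dynamics model.
reach_corner = lambda s: 1.0 if s == (2, 2) else 0.0
reach_centre = lambda s: 1.0 if s == (1, 1) else 0.0
print(plan((0, 0), reach_corner), plan((0, 0), reach_centre))
```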

Rainbow: Combining Improvements in Deep Reinforcement Learning

6 Oct 2017 · facebookresearch/ReAgent

The deep reinforcement learning community has made several independent improvements to the DQN algorithm.

MONTEZUMA'S REVENGE

Exploration by Random Network Distillation

30 Oct 2018 · openai/random-network-distillation

In particular, we establish state-of-the-art performance on Montezuma's Revenge, a game famously difficult for deep reinforcement learning methods.

MONTEZUMA'S REVENGE
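
For context, the random-network-distillation bonus works by training a predictor network to match the output of a fixed, randomly initialised target network; rarely seen observations are predicted poorly and therefore earn a larger intrinsic reward. The numpy sketch below illustrates the idea under simplified assumptions (linear networks, plain SGD), not the paper's actual architecture.

```python
# Minimal sketch of the random-network-distillation idea: the intrinsic reward
# for an observation is the prediction error of a trained network against a
# fixed, randomly initialised target network. Sizes and the plain SGD update
# are illustrative choices.
import numpy as np

rng = np.random.default_rng(0)
OBS_DIM, FEAT_DIM, LR = 8, 16, 1e-2

W_target = rng.normal(size=(OBS_DIM, FEAT_DIM))   # fixed random target network
W_pred = rng.normal(size=(OBS_DIM, FEAT_DIM))     # trained predictor network

def intrinsic_reward(obs):
    """Squared prediction error; also take one SGD step on the predictor."""
    global W_pred
    err = obs @ W_pred - obs @ W_target
    W_pred -= LR * np.outer(obs, err)              # gradient of 0.5 * ||err||^2
    return float((err ** 2).mean())

obs = rng.normal(size=OBS_DIM)
print([round(intrinsic_reward(obs), 3) for _ in range(5)])  # bonus shrinks with familiarity
```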

First return, then explore

27 Apr 2020 · uber-research/go-explore

The promise of reinforcement learning is to solve complex sequential decision problems by specifying only a high-level reward function.

MONTEZUMA'S REVENGE

Go-Explore: a New Approach for Hard-Exploration Problems

30 Jan 2019 · uber-research/go-explore

Go-Explore can also harness human-provided domain knowledge and, when augmented with it, scores a mean of over 650k points on Montezuma's Revenge.

IMITATION LEARNING MONTEZUMA'S REVENGE

Empowerment-driven Exploration using Mutual Information Estimation

11 Oct 2018 · navneet-nmk/pytorch-rl

However, many state-of-the-art deep reinforcement learning algorithms that rely on epsilon-greedy exploration fail on such sparse-reward environments.

MONTEZUMA'S REVENGE MUTUAL INFORMATION ESTIMATION
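
The epsilon-greedy rule criticised in that snippet acts randomly with probability epsilon and greedily otherwise, so exploration is undirected noise rather than being driven toward novel states. A minimal, purely illustrative sketch:

```python
# Epsilon-greedy action selection: with probability epsilon take a uniformly
# random action, otherwise the greedy one. Exploration is undirected noise,
# which is why it struggles in sparse-reward games like Montezuma's Revenge.
import random

def epsilon_greedy(q_values, epsilon=0.1):
    """q_values: list of estimated action values for the current state."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))                    # explore at random
    return max(range(len(q_values)), key=q_values.__getitem__)    # exploit

print(epsilon_greedy([0.1, 0.5, 0.2]))
```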

Unifying Count-Based Exploration and Intrinsic Motivation

NeurIPS 2016 · RLAgent/state-marginal-matching

We consider an agent's uncertainty about its environment and the problem of generalizing this uncertainty across observations.

MONTEZUMA'S REVENGE
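
The underlying idea is a count-based exploration bonus that decays as a state is visited more often; the paper's contribution is deriving pseudo-counts from a density model so the bonus generalises across high-dimensional observations. The tabular sketch below shows only the simplified bonus, with illustrative constants:

```python
# Simplified sketch of a count-based exploration bonus: the reward is augmented
# with a term that decays as a state's visit count grows. A plain tabular count
# stands in for the paper's density-model pseudo-counts, purely for illustration.
from collections import Counter

visit_counts = Counter()
BETA = 0.1  # bonus scale (illustrative)

def augmented_reward(state, extrinsic_reward):
    visit_counts[state] += 1
    bonus = BETA / (visit_counts[state] ** 0.5)
    return extrinsic_reward + bonus

print([round(augmented_reward("room_1", 0.0), 3) for _ in range(4)])  # 0.1, 0.071, 0.058, 0.05
```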

Scaling All-Goals Updates in Reinforcement Learning Using Convolutional Neural Networks

ICLR 2019 · fabiopardo/qmap

Being able to reach any desired location in the environment can be a valuable asset for an agent.

MONTEZUMA'S REVENGE Q-LEARNING SNES GAMES
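
The "all-goals" idea can be sketched in tabular form: every transition updates the goal-conditioned value for all candidate goals at once, with reward 1 when the goal is reached; the paper scales this to every screen location with a convolutional network. The tabular version and all constants below are illustrative assumptions:

```python
# Illustrative sketch of an all-goals update: a single transition (s, a, s') is
# used to update the goal-conditioned value Q[g][s][a] for every candidate goal
# g at once, treating a reached goal as terminal.
from collections import defaultdict

GAMMA, ALPHA, N_ACTIONS = 0.9, 0.5, 4
Q = defaultdict(lambda: defaultdict(lambda: [0.0] * N_ACTIONS))  # Q[goal][state][action]

def all_goals_update(state, action, next_state, goals):
    for g in goals:
        reward = 1.0 if next_state == g else 0.0
        target = reward + (0.0 if next_state == g else GAMMA * max(Q[g][next_state]))
        Q[g][state][action] += ALPHA * (target - Q[g][state][action])

goals = [(0, 1), (2, 2), (4, 4)]
all_goals_update((0, 0), 0, (0, 1), goals)
print(Q[(0, 1)][(0, 0)][0])  # value toward the goal that was reached has increased
```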

Playing hard exploration games by watching YouTube

NeurIPS 2018 · MaxSobolMark/HardRLWithYoutube

One successful method of guiding exploration in these domains is to imitate trajectories provided by a human demonstrator.

MONTEZUMA'S REVENGE

Hierarchical Deep Reinforcement Learning: Integrating Temporal Abstraction and Intrinsic Motivation

NeurIPS 2016 · transedward/pytorch-hdqn

Learning goal-directed behavior in environments with sparse feedback is a major challenge for reinforcement learning algorithms.

MONTEZUMA'S REVENGE