Montezuma's Revenge
28 papers with code • 1 benchmark • 1 dataset
Montezuma's Revenge is an Atari 2600 benchmark game known to be difficult for reinforcement learning algorithms. Solutions typically employ algorithms that incentivise environment exploration in different ways.
For the state-of-the-art tables, please consult the parent Atari Games task.
(Image credit: Q-map)
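One common way to incentivise exploration, used by several of the bonus-based methods listed below, is to add an intrinsic reward that decays with how often a state has been visited. The sketch below is a minimal, hypothetical illustration of a count-based bonus (the `CountBonus` class and its `beta` parameter are assumptions for illustration, not taken from any specific paper):

```python
from collections import defaultdict
import math

class CountBonus:
    """Minimal count-based exploration bonus: novel states earn extra reward."""

    def __init__(self, beta=1.0):
        self.beta = beta                # bonus scale
        self.counts = defaultdict(int)  # state -> visit count

    def bonus(self, state):
        self.counts[state] += 1
        # Bonus decays as 1/sqrt(N(s)), so rarely visited states
        # yield larger intrinsic rewards than familiar ones.
        return self.beta / math.sqrt(self.counts[state])

b = CountBonus(beta=1.0)
first = b.bonus("room_1")   # 1.0 on the first visit
second = b.bonus("room_1")  # smaller on the second visit
```

In practice, the bonus is added to the environment reward before the policy update, so the agent is drawn toward under-explored regions of the game even when extrinsic rewards are sparse.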
Latest papers with no code
Creativity of AI: Hierarchical Planning Model Learning for Facilitating Deep Reinforcement Learning
Despite achieving great success in real-world applications, Deep Reinforcement Learning (DRL) still suffers from three critical issues: data efficiency, lack of interpretability, and lack of transferability.
Entropic Desired Dynamics for Intrinsic Control
An agent might be said, informally, to have mastery of its environment when it has maximised the effective number of states it can reliably reach.
On Bonus-Based Exploration Methods in the Arcade Learning Environment
Research on exploration in reinforcement learning, as applied to Atari 2600 game-playing, has emphasized tackling difficult exploration problems such as Montezuma's Revenge (Bellemare et al., 2016).
Bridging the Gap: Providing Post-Hoc Symbolic Explanations for Sequential Decision-Making Problems with Inscrutable Representations
As increasingly complex AI systems are introduced into our daily lives, it becomes important for such systems to be capable of explaining the rationale for their decisions and allowing users to contest these decisions.
MIME: Mutual Information Minimisation Exploration
We show that reinforcement learning agents that learn by surprise (surprisal) get stuck at abrupt environmental transition boundaries because these transitions are difficult to learn.
Benchmarking Bonus-Based Exploration Methods on the Arcade Learning Environment
This paper provides an empirical evaluation of recently developed exploration algorithms within the Arcade Learning Environment (ALE).
Memory Based Trajectory-conditioned Policies for Learning from Sparse Rewards
Reinforcement learning with sparse rewards is challenging because an agent can rarely obtain non-zero rewards, and hence gradient-based optimization of parameterized policies can be incremental and slow.
Learning and Exploiting Multiple Subgoals for Fast Exploration in Hierarchical Reinforcement Learning
To achieve fast exploration without using manual design, we devise a multi-goal HRL algorithm, consisting of a high-level policy Manager and a low-level policy Worker.
Escape Room: A Configurable Testbed for Hierarchical Reinforcement Learning
We show that the ERD presents a suite of challenges with scalable difficulty to provide a smooth learning gradient from Taxi to the Arcade Learning Environment.