NetHack
16 papers with code • 0 benchmarks • 0 datasets
Mean in-game score over 1000 episodes with random seeds not seen during training. See https://arxiv.org/abs/2006.13760 (Section 2.4 Evaluation Protocol) for details.
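The protocol above (mean in-game score over many episodes, each seeded with a seed never used during training) can be sketched as follows. This is a minimal illustration: `DummyEnv`, `make_env`, and `mean_score` are invented stand-ins, not the actual NetHack Learning Environment API.

```python
import random

class DummyEnv:
    """Illustrative stand-in for a seeded NetHack environment (not the real NLE API)."""
    def __init__(self, seed):
        self.rng = random.Random(seed)
        self.steps = 0

    def reset(self):
        self.steps = 0
        return "obs"

    def step(self, action):
        self.steps += 1
        reward = self.rng.randint(0, 10)  # stand-in for in-game score delta
        done = self.steps >= 5            # stand-in for episode termination
        return "obs", reward, done

def mean_score(policy, make_env, eval_seeds):
    """Average total episode score over seeds held out from training."""
    scores = []
    for seed in eval_seeds:
        env = make_env(seed)
        obs, done, total = env.reset(), False, 0.0
        while not done:
            obs, reward, done = env.step(policy(obs))
            total += reward
        scores.append(total)
    return sum(scores) / len(scores)

# e.g. evaluate a trivial constant policy on 1000 held-out seeds
avg = mean_score(lambda obs: 0, DummyEnv, eval_seeds=range(1000))
```

The key point is that `eval_seeds` is disjoint from the training seeds, so the reported mean measures generalization rather than memorized dungeon layouts.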
Benchmarks
These leaderboards are used to track progress in NetHack
Libraries
Use these libraries to find NetHack models and implementations
Latest papers
Playing NetHack with LLMs: Potential & Limitations as Zero-Shot Agents
In contrast, agents tested in dynamic robot environments face limitations due to simplistic environments with only a few objects and interactions.
Craftax: A Lightning-Fast Benchmark for Open-Ended Reinforcement Learning
Either they are too slow for meaningful research to be performed without enormous computational resources, like Crafter, NetHack and Minecraft, or they are not complex enough to pose a significant challenge, like Minigrid and Procgen.
diff History for Neural Language Agents
On NetHack, an unsolved video game that requires long-horizon reasoning for decision-making, LMs tuned with diff history match state-of-the-art performance for neural agents while needing 1800x fewer training examples compared to prior work.
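The core idea described here is to feed the language model textual diffs between consecutive observations rather than each full observation. A minimal illustration using Python's `difflib` (the observation strings are invented examples, not real NetHack output, and the paper's actual preprocessing may differ):

```python
import difflib

def obs_diff(prev_obs: str, curr_obs: str) -> str:
    """Unified diff between two consecutive text observations."""
    return "\n".join(
        difflib.unified_diff(prev_obs.splitlines(), curr_obs.splitlines(), lineterm="")
    )

prev = "You see here a door.\nHP: 12/12"
curr = "The door opens.\nHP: 12/12"
diff_text = obs_diff(prev, curr)
```

Because most of a NetHack screen is unchanged between steps, the diff is far shorter than the full observation, which is what allows a long interaction history to fit in the model's context.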
Motif: Intrinsic Motivation from Artificial Intelligence Feedback
Exploring rich environments and evaluating one's actions without prior knowledge is immensely challenging.
LuckyMera: a Modular AI Framework for Building Hybrid NetHack Agents
In the last few decades we have witnessed a significant development in Artificial Intelligence (AI) thanks to the availability of a variety of testbeds, mostly based on simulated environments and video games.
Katakomba: Tools and Benchmarks for Data-Driven NetHack
NetHack is known as the frontier of reinforcement learning research where learning-based methods still need to catch up to rule-based solutions.
Dungeons and Data: A Large-Scale NetHack Dataset
Recent breakthroughs in the development of agents to solve challenging sequential decision making problems such as Go, StarCraft, or DOTA, have relied on both simulated environments and large-scale datasets.
Improving Policy Learning via Language Dynamics Distillation
Recent work has shown that augmenting environments with language descriptions improves policy learning.
Hierarchical Kickstarting for Skill Transfer in Reinforcement Learning
In this paper, we investigate how skills can be incorporated into the training of reinforcement learning (RL) agents in complex environments with large state-action spaces and sparse rewards.
Insights From the NeurIPS 2021 NetHack Challenge
In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge.