NetHack

16 papers with code • 0 benchmarks • 0 datasets

The metric is the mean in-game score over 1000 episodes, each played with a random seed not seen during training. See https://arxiv.org/abs/2006.13760 (Section 2.4, Evaluation Protocol) for details.
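The metric above can be illustrated with a minimal sketch. It assumes nothing about the NetHack Learning Environment API: `held_out_seeds`, `mean_score`, and `dummy_episode` are hypothetical names, and `dummy_episode` is a placeholder for a real agent rollout that returns the final in-game score for a given seed. The exact protocol is described in the paper section cited above.

```python
import random
from statistics import mean
from typing import Callable, Iterable, List


def held_out_seeds(train_seeds: Iterable[int], n: int, rng: random.Random) -> List[int]:
    """Draw n distinct seeds, none of which appear in train_seeds."""
    seen = set(train_seeds)
    seeds: List[int] = []
    while len(seeds) < n:
        candidate = rng.randrange(2**31)
        if candidate not in seen:
            seen.add(candidate)  # also prevents duplicates among evaluation seeds
            seeds.append(candidate)
    return seeds


def mean_score(run_episode: Callable[[int], float], seeds: Iterable[int]) -> float:
    """Average the final score returned by run_episode over all given seeds."""
    return mean(run_episode(seed) for seed in seeds)


def dummy_episode(seed: int) -> float:
    """Hypothetical stand-in for an agent rollout: a deterministic fake score per seed."""
    return float(random.Random(seed).randint(0, 1000))


if __name__ == "__main__":
    training_seeds = range(100)  # assumed set of seeds used during training
    eval_seeds = held_out_seeds(training_seeds, 1000, random.Random(0))
    print(mean_score(dummy_episode, eval_seeds))
```

In practice `dummy_episode` would be swapped for a function that resets the environment with the given seed, plays one episode with the trained agent, and returns the in-game score.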

Benchmarks


These leaderboards are used to track progress in NetHack

You can find evaluation results in the subtasks. You can also submit evaluation metrics for this task.

Libraries

Use these libraries to find NetHack models and implementations

facebookresearch/minihack • 3 papers • 448 stars

Subtasks

NetHack Score

Latest papers with no code


Skill Set Optimization: Reinforcing Language Model Behavior via Transferable Skills

no code yet • 5 Feb 2024

We evaluate our method in the classic videogame NetHack and the text environment ScienceWorld to demonstrate SSO's ability to optimize a set of skills and perform in-context policy improvement.


Fine-tuning Reinforcement Learning Models is Secretly a Forgetting Mitigation Problem

no code yet • 5 Feb 2024

Fine-tuning is a widespread technique that allows practitioners to transfer pre-trained capabilities, as recently showcased by the successful applications of foundation models.


Selective Perception: Optimizing State Descriptions with Reinforcement Learning for Language Model Actors

no code yet • 21 Jul 2023

Large language models (LLMs) are being applied as actors for sequential decision making tasks in domains such as robotics and games, utilizing their general world knowledge and planning abilities.


Scaling Laws for Imitation Learning in Single-Agent Games

no code yet • 18 Jul 2023

Inspired by recent work in Natural Language Processing (NLP) where "scaling up" has resulted in increasingly more capable LLMs, we investigate whether carefully scaling up model and data size can bring similar improvements in the imitation learning setting for single-agent games.


Accelerating exploration and representation learning with offline pre-training

no code yet • 31 Mar 2023

In this work, we follow the hypothesis that exploration and representation learning can be improved by separately learning two different models from a single offline dataset.


SILG: The Multi-domain Symbolic Interactive Language Grounding Benchmark

no code yet • NeurIPS 2021

We hope SILG enables the community to quickly identify new methodologies for language grounding that generalize to a diverse set of environments and their associated challenges.


Exploration in NetHack With Secret Discovery

no code yet • 8 Nov 2017

Our algorithm is based on the concept of occupancy maps popular in robotics, adapted to encourage efficient discovery of secret access points.

