1 code implementation • 12 Dec 2023 • Ulyana Piterbarg, Lerrel Pinto, Rob Fergus
On NetHack, an unsolved video game that requires long-horizon reasoning for decision-making, LMs tuned with diff history match state-of-the-art performance for neural agents while needing 1800x fewer training examples than prior work.
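The excerpt above does not define "diff history". As a rough illustration only, the sketch below assumes it means keeping the first text observation in full and replacing each later observation with a line-level diff against the previous one, so the model's context carries only what changed. The function name `diff_history` and the sample observations are hypothetical, and Python's standard `difflib` stands in for whatever diff tool the authors actually use.

```python
import difflib


def diff_history(observations):
    """Hypothetical helper: first observation in full, then a diff per later step.

    Assumption: each observation is a multi-line string describing the game state.
    """
    if not observations:
        return []
    history = [observations[0]]
    for prev, curr in zip(observations, observations[1:]):
        delta = difflib.unified_diff(
            prev.splitlines(),
            curr.splitlines(),
            lineterm="",
            n=0,  # no context lines: keep only what changed
        )
        # Skip the two file-header lines ("---" and "+++"); keep the hunks.
        # If nothing changed, the diff is empty and this yields "".
        history.append("\n".join(list(delta)[2:]))
    return history


if __name__ == "__main__":
    # Made-up observations, not real NetHack output.
    obs = [
        "You see a door.\nHP: 10",
        "You see a door.\nHP: 8",
    ]
    for step in diff_history(obs):
        print(step)
        print("----")
```

Under this assumption, unchanged lines such as the room description drop out of later steps, which is one plausible reason a diff-based history could be more compact than repeating full observations.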