no code implementations • ICML 2020 • Roberta Raileanu, Max Goldstein, Arthur Szlam, Rob Fergus
An ensemble of conventional RL policies is used to gather experience on training environments, from which embeddings of both policies and environments can be learned.
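The joint policy/environment embedding idea can be illustrated with a toy sketch. Everything below is a hypothetical stand-in, not the paper's method: returns are generated with planted low-rank structure, and a plain SVD plays the role of the learned embedding networks.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: returns[i, j] = average return of ensemble policy i
# on training environment j, gathered by rolling out each policy.
n_policies, n_envs, dim = 5, 8, 3
true_p = rng.normal(size=(n_policies, dim))
true_e = rng.normal(size=(n_envs, dim))
returns = true_p @ true_e.T  # planted structure: performance = <policy, env>

# Learn joint embeddings by factorizing the policy-by-environment
# return matrix (a stand-in for the paper's learned embeddings).
u, s, vt = np.linalg.svd(returns, full_matrices=False)
policy_emb = u[:, :dim] * s[:dim]   # one embedding per policy
env_emb = vt[:dim].T                # one embedding per environment

# The embeddings jointly reconstruct the observed performance.
recon = policy_emb @ env_emb.T
print(np.max(np.abs(recon - returns)))
```

Because the return matrix here is rank-3 by construction, three-dimensional embeddings reconstruct it to numerical precision; with real rollouts the factorization would only be approximate.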
no code implementations • 20 Sep 2023 • Shehzaad Dhuliawala, Mojtaba Komeili, Jing Xu, Roberta Raileanu, Xian Li, Asli Celikyilmaz, Jason Weston
Generation of plausible yet incorrect factual information, termed hallucination, is an unsolved issue in large language models.
no code implementations • 19 Jul 2023 • Jean Kaddour, Joshua Harris, Maximilian Mozes, Herbie Bradley, Roberta Raileanu, Robert McHardy
Due to the fast pace of the field, it is difficult to identify the remaining challenges and already fruitful application areas.
no code implementations • 3 Jul 2023 • Yihong Chen, Kelly Marchisio, Roberta Raileanu, David Ifeoluwa Adelani, Pontus Stenetorp, Sebastian Riedel, Mikel Artetxe
Pretrained language models (PLMs) are today the primary model for natural language processing.
1 code implementation • 8 Jun 2023 • Yiding Jiang, J. Zico Kolter, Roberta Raileanu
Existing approaches for improving generalization in deep reinforcement learning (RL) have mostly focused on representation learning, neglecting RL-specific aspects such as exploration.
2 code implementations • 5 Jun 2023 • Mikael Henaff, Minqi Jiang, Roberta Raileanu
This results in an algorithm which sets a new state of the art across 16 tasks from the MiniHack suite used in prior work, and also performs robustly on Habitat and Montezuma's Revenge.
1 code implementation • 2 Jun 2023 • Theresa Eimer, Marius Lindauer, Roberta Raileanu
In order to improve reproducibility, deep reinforcement learning (RL) has been adopting better scientific practices such as standardized evaluation metrics and reporting.
no code implementations • 6 Mar 2023 • Mikayel Samvelyan, Akbir Khan, Michael Dennis, Minqi Jiang, Jack Parker-Holder, Jakob Foerster, Roberta Raileanu, Tim Rocktäschel
Open-ended learning methods that automatically generate a curriculum of increasingly challenging tasks serve as a promising avenue toward generally capable reinforcement learning agents.
1 code implementation • 15 Feb 2023 • Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann Lecun, Thomas Scialom
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools.
2 code implementations • 9 Feb 2023 • Timo Schick, Jane Dwivedi-Yu, Roberto Dessì, Roberta Raileanu, Maria Lomeli, Luke Zettlemoyer, Nicola Cancedda, Thomas Scialom
Language models (LMs) exhibit remarkable abilities to solve new tasks from just a few examples or textual instructions, especially at scale.
Ranked #9 on Math Word Problem Solving on MAWPS
1 code implementation • 18 Nov 2022 • Jean-Baptiste Gaya, Thang Doan, Lucas Caccia, Laure Soulier, Ludovic Denoyer, Roberta Raileanu
We introduce Continual Subspace of Policies (CSP), a new approach that incrementally builds a subspace of policies for training a reinforcement learning agent on a sequence of tasks.
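The subspace-of-policies idea can be sketched in miniature. This is a hedged toy version, assuming a policy is just a flat parameter vector: the subspace is the convex hull of "anchor" vectors, the per-task objective and the "pretend-trained" candidate below are stand-ins for actual RL training, and the growth test mirrors (but simplifies) the idea of expanding only when reuse is insufficient.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 4

def task_return(params, target):
    # Toy stand-in for a task's return: closer to the target is better.
    return -np.linalg.norm(params - target)

def combine(anchors, alphas):
    # A policy inside the subspace: a convex combination of anchors.
    alphas = np.asarray(alphas, dtype=float)
    alphas = alphas / alphas.sum()
    return np.sum([w * a for w, a in zip(alphas, anchors)], axis=0)

def best_in_subspace(anchors, target, n_samples=500):
    # Crude random search over convex weights (learned in practice).
    best_r = -np.inf
    for _ in range(n_samples):
        alphas = rng.dirichlet(np.ones(len(anchors)))
        best_r = max(best_r, task_return(combine(anchors, alphas), target))
    return best_r

anchors = [rng.normal(size=dim)]
for _ in range(3):
    target = rng.normal(size=dim)                      # a new task arrives
    r_sub = best_in_subspace(anchors, target)
    candidate = target + 0.01 * rng.normal(size=dim)   # pretend-trained anchor
    if task_return(candidate, target) > r_sub + 0.1:
        anchors.append(candidate)   # grow the subspace for this task
    # otherwise only the mixing weights are stored: cheap reuse

print(len(anchors))
```

The appeal of the scheme is the storage trade-off: tasks the subspace already covers cost only a weight vector, while genuinely novel tasks add one anchor.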
1 code implementation • 1 Nov 2022 • Eric Hambro, Roberta Raileanu, Danielle Rothermel, Vegard Mella, Tim Rocktäschel, Heinrich Küttler, Naila Murray
Recent breakthroughs in the development of agents to solve challenging sequential decision making problems such as Go, StarCraft, or DOTA, have relied on both simulated environments and large-scale datasets.
2 code implementations • 11 Oct 2022 • Mikael Henaff, Roberta Raileanu, Minqi Jiang, Tim Rocktäschel
In recent years, a number of reinforcement learning (RL) methods have been proposed to explore complex environments which differ across episodes.
no code implementations • 22 Mar 2022 • Eric Hambro, Sharada Mohanty, Dmitrii Babaev, Minwoo Byeon, Dipam Chakraborty, Edward Grefenstette, Minqi Jiang, DaeJin Jo, Anssi Kanervisto, Jongmin Kim, Sungwoong Kim, Robert Kirk, Vitaly Kurin, Heinrich Küttler, Taehwon Kwon, Donghoon Lee, Vegard Mella, Nantas Nardelli, Ivan Nazarov, Nikita Ovsov, Jack Parker-Holder, Roberta Raileanu, Karolis Ramanauskas, Tim Rocktäschel, Danielle Rothermel, Mikayel Samvelyan, Dmitry Sorokin, Maciej Sypetkowski, Michał Sypetkowski
In this report, we summarize the takeaways from the first NeurIPS 2021 NetHack Challenge.
1 code implementation • 17 Feb 2022 • Jesse Mu, Victor Zhong, Roberta Raileanu, Minqi Jiang, Noah Goodman, Tim Rocktäschel, Edward Grefenstette
Reinforcement learning (RL) agents are particularly hard to train when rewards are sparse.
1 code implementation • NeurIPS 2021 • Roberta Raileanu, Maxwell Goldstein, Denis Yarats, Ilya Kostrikov, Rob Fergus
Deep reinforcement learning (RL) agents often fail to generalize beyond their training environments.
1 code implementation • 27 Jul 2021 • Open Ended Learning Team, Adam Stooke, Anuj Mahajan, Catarina Barros, Charlie Deck, Jakob Bauer, Jakub Sygnowski, Maja Trebacz, Max Jaderberg, Michael Mathieu, Nat McAleese, Nathalie Bradley-Schmieg, Nathaniel Wong, Nicolas Porcel, Roberta Raileanu, Steph Hughes-Fitt, Valentin Dalibard, Wojciech Marian Czarnecki
The resulting space is exceptionally diverse in terms of the challenges posed to agents, and as such, even measuring the learning progress of an agent is an open research problem.
1 code implementation • 20 Feb 2021 • Roberta Raileanu, Rob Fergus
Standard deep reinforcement learning algorithms use a shared representation for the policy and value function, especially when training directly from images.
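The contrast between shared and decoupled representations can be sketched in a toy form. The linear "networks" and shapes below are illustrative stand-ins, not the paper's architecture: the point is only that in the shared pattern the value loss backpropagates through the same trunk the policy reads from, while in the decoupled pattern the two sets of parameters never interact.

```python
import numpy as np

rng = np.random.default_rng(0)
obs_dim, n_actions = 8, 4

# Common pattern: one shared trunk feeding a policy head and a value head.
shared = {
    "trunk": rng.normal(size=(obs_dim, 16)),
    "pi_head": rng.normal(size=(16, n_actions)),
    "v_head": rng.normal(size=(16, 1)),
}
# Decoupled pattern: entirely separate parameters per function, so
# value-fitting gradients cannot distort the policy's features.
decoupled = {
    "pi": rng.normal(size=(obs_dim, n_actions)),
    "v": rng.normal(size=(obs_dim, 1)),
}

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

obs = rng.normal(size=obs_dim)
# Shared: a value update would change "trunk", hence also this policy.
pi_shared = softmax(obs @ shared["trunk"] @ shared["pi_head"])
# Decoupled: updating decoupled["v"] leaves decoupled["pi"] untouched.
pi_decoupled = softmax(obs @ decoupled["pi"])
print(pi_shared.shape, pi_decoupled.shape)
```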
3 code implementations • NeurIPS 2020 • Heinrich Küttler, Nantas Nardelli, Alexander H. Miller, Roberta Raileanu, Marco Selvatici, Edward Grefenstette, Tim Rocktäschel
Here, we present the NetHack Learning Environment (NLE), a scalable, procedurally generated, stochastic, rich, and challenging environment for RL research based on the popular single-player terminal-based roguelike game, NetHack.
Ranked #1 on NetHack Score on NetHack Learning Environment
5 code implementations • ICLR 2021 • Andres Campero, Roberta Raileanu, Heinrich Küttler, Joshua B. Tenenbaum, Tim Rocktäschel, Edward Grefenstette
A key challenge for reinforcement learning (RL) consists of learning in environments with sparse extrinsic rewards.
2 code implementations • ICLR 2020 • Roberta Raileanu, Tim Rocktäschel
However, we show that existing methods fall short in procedurally-generated environments where an agent is unlikely to visit a state more than once.
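Why count-based bonuses fall short in procedurally-generated environments can be shown with a small numerical sketch. This is a hedged illustration of the failure mode only, not the paper's proposed method: with a classic $1/\sqrt{N(s)}$ bonus, repeated states in a singleton environment produce a decaying, informative signal, whereas near-unique states make the bonus a flat constant.

```python
from collections import Counter
import numpy as np

rng = np.random.default_rng(0)

def count_bonus(states):
    # Classic count-based exploration bonus: 1 / sqrt(visit count).
    counts = Counter()
    bonuses = []
    for s in states:
        counts[s] += 1
        bonuses.append(1.0 / np.sqrt(counts[s]))
    return bonuses

# Singleton environment: ~10 distinct states, each revisited many times.
singleton = [int(rng.integers(0, 10)) for _ in range(1000)]
# Procedurally-generated environment: states are effectively unique.
procgen = [int(rng.integers(0, 10**9)) for _ in range(1000)]

mean_singleton = float(np.mean(count_bonus(singleton)))
mean_procgen = float(np.mean(count_bonus(procgen)))
print(mean_singleton, mean_procgen)
```

In the singleton case the average bonus shrinks well below 1 as states are revisited; in the procedurally-generated case nearly every state has count 1, so the bonus stays pinned near 1 and carries almost no exploration signal.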
no code implementations • ICLR 2019 • Cinjon Resnick, Roberta Raileanu, Sanyam Kapoor, Alexander Peysakhovich, Kyunghyun Cho, Joan Bruna
Our contributions are that we analytically characterize the types of environments where Backplay can improve training speed, demonstrate the effectiveness of Backplay both in large grid worlds and in a complex four-player zero-sum game (Pommerman), and show that Backplay compares favorably to other competitive methods known to improve sample efficiency.
1 code implementation • ICML 2018 • Roberta Raileanu, Emily Denton, Arthur Szlam, Rob Fergus
We consider the multi-agent reinforcement learning setting with imperfect information in which each agent is trying to maximize its own utility.