no code implementations • 9 Feb 2023 • Andreea Deac, Théophane Weber, George Papamakarios
Model-based reinforcement learning algorithms, such as the highly successful MuZero, aim to accomplish this by learning a world model.
Model-based Reinforcement Learning reinforcement-learning +2
no code implementations • 8 Feb 2023 • Jacob Walker, Eszter Vértes, Yazhe Li, Gabriel Dulac-Arnold, Ankesh Anand, Théophane Weber, Jessica B. Hamrick
Our results show that intrinsic exploration combined with environment models present a viable direction towards agents that are self-supervised and able to generalize to novel reward functions.
no code implementations • 13 Jan 2023 • Pol Moreno, Adam R. Kosiorek, Heiko Strathmann, Daniel Zoran, Rosalia G. Schneider, Björn Winckler, Larisa Markeeva, Théophane Weber, Danilo J. Rezende
NeRF provides unparalleled fidelity of novel view synthesis: rendering a 3D scene from an arbitrary viewpoint.
no code implementations • 10 Jun 2022 • Peter C. Humphreys, Arthur Guez, Olivier Tieleman, Laurent SIfre, Théophane Weber, Timothy Lillicrap
Effective decision making involves flexibly relating past experiences and relevant contextual information to a novel situation.
no code implementations • ICLR 2022 • Ankesh Anand, Jacob Walker, Yazhe Li, Eszter Vértes, Julian Schrittwieser, Sherjil Ozair, Théophane Weber, Jessica B. Hamrick
One of the key promises of model-based reinforcement learning is the ability to generalize using an internal model of the world to make predictions in novel environments and tasks.
Ranked #1 on Meta-Learning on ML10 (Meta-test success rate (zero-shot) metric)
no code implementations • 3 Feb 2021 • Pol Moreno, Edward Hughes, Kevin R. McKee, Bernardo Avila Pires, Théophane Weber
We also show that higher-order belief models outperform agents with lower-order models.
no code implementations • 18 Nov 2020 • Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Éric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos
Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards.
no code implementations • ICLR 2021 • Jessica B. Hamrick, Abram L. Friesen, Feryal Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Veličković, Théophane Weber
These results indicate where and how to utilize planning in reinforcement learning settings, and highlight a number of open questions for future MBRL research.
Model-based Reinforcement Learning reinforcement-learning +1
1 code implementation • 11 Sep 2020 • Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Peter Karkus, Sébastien Racanière, Lars Buesing, Timothy Lillicrap, Nicolas Heess
To encourage progress towards this goal we introduce a set of physically embedded planning problems and make them publicly available.
no code implementations • NeurIPS 2020 • Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
1 code implementation • ICLR 2019 • Arthur Guez, Mehdi Mirza, Karol Gregor, Rishabh Kabra, Sébastien Racanière, Théophane Weber, David Raposo, Adam Santoro, Laurent Orseau, Tom Eccles, Greg Wayne, David Silver, Timothy Lillicrap
The field of reinforcement learning (RL) is facing increasingly challenging domains with combinatorial complexity.
no code implementations • 7 Jan 2019 • Théophane Weber, Nicolas Heess, Lars Buesing, David Silver
Stochastic computation graphs (SCGs) provide a formalism to represent structured optimization problems arising in artificial intelligence, including supervised, unsupervised, and reinforcement learning.
1 code implementation • NeurIPS 2018 • Laurent Orseau, Levi H. S. Lelis, Tor Lattimore, Théophane Weber
We introduce two novel tree search algorithms that use a policy to guide search.
2 code implementations • ICML 2018 • Arthur Guez, Théophane Weber, Ioannis Antonoglou, Karen Simonyan, Oriol Vinyals, Daan Wierstra, Rémi Munos, David Silver
They are most typically solved by tree search algorithms that simulate ahead into the future, evaluate future states, and back-up those evaluations to the root of a search tree.
2 code implementations • 19 Jul 2017 • Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sebastien Racanière, David Reichert, Théophane Weber, Daan Wierstra, Peter Battaglia
Here we introduce the "Imagination-based Planner", the first model-based, sequential decision-making agent that can learn to construct, evaluate, and execute plans.
2 code implementations • NeurIPS 2017 • Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects.
Model-based Reinforcement Learning reinforcement-learning +1