no code implementations • 13 Jan 2022 • Nenad Tomasev, Ioana Bica, Brian McWilliams, Lars Buesing, Razvan Pascanu, Charles Blundell, Jovana Mitrovic
Most notably, ReLICv2 is the first unsupervised representation learning method to consistently outperform the supervised baseline in a like-for-like comparison over a range of ResNet architectures.
Ranked #14 on Semantic Segmentation on PASCAL VOC 2012 val
no code implementations • 18 Nov 2020 • Thomas Mesnard, Théophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Arthur Guez, Éric Moulines, Marcus Hutter, Lars Buesing, Rémi Munos
Credit assignment in reinforcement learning is the problem of measuring an action's influence on future rewards.
no code implementations • ICLR 2021 • Jessica B. Hamrick, Abram L. Friesen, Feryal Behbahani, Arthur Guez, Fabio Viola, Sims Witherspoon, Thomas Anthony, Lars Buesing, Petar Veličković, Théophane Weber
These results indicate where and how to utilize planning in reinforcement learning settings, and highlight a number of open questions for future MBRL research.
Model-based Reinforcement Learning reinforcement-learning +1
2 code implementations • 15 Oct 2020 • Jovana Mitrovic, Brian McWilliams, Jacob Walker, Lars Buesing, Charles Blundell
Self-supervised learning has emerged as a strategy to reduce the reliance on costly supervised signal by pretraining representations only using unlabeled data.
Ranked #77 on Self-Supervised Image Classification on ImageNet
no code implementations • 3 Oct 2020 • Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber
We explore whether integrated tasks like Mujoban can be solved by composing RL modules together in a sense-plan-act hierarchy, where modules have well-defined roles similarly to classic robot architectures.
1 code implementation • 11 Sep 2020 • Mehdi Mirza, Andrew Jaegle, Jonathan J. Hunt, Arthur Guez, Saran Tunyasuvunakool, Alistair Muldal, Théophane Weber, Peter Karkus, Sébastien Racanière, Lars Buesing, Timothy Lillicrap, Nicolas Heess
To encourage progress towards this goal we introduce a set of physically embedded planning problems and make them publicly available.
no code implementations • NeurIPS 2020 • Petar Veličković, Lars Buesing, Matthew C. Overlan, Razvan Pascanu, Oriol Vinyals, Charles Blundell
This static input structure is often informed purely by insight of the machine learning practitioner, and might not be optimal for the actual task the GNN is solving.
no code implementations • 23 Apr 2020 • Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber
are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed.
no code implementations • NeurIPS 2020 • Arthur Guez, Fabio Viola, Théophane Weber, Lars Buesing, Steven Kapturowski, Doina Precup, David Silver, Nicolas Heess
Value estimation is a critical component of the reinforcement learning (RL) paradigm.
no code implementations • 7 Feb 2020 • Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing
In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions.
no code implementations • ICLR 2020 • Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Tobias Pfaff, Theophane Weber, Lars Buesing, Peter W. Battaglia
In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values.
1 code implementation • 15 Oct 2019 • Lars Buesing, Nicolas Heess, Theophane Weber
A plethora of problems in AI, engineering and the sciences are naturally formalized as inference in discrete probabilistic models.
no code implementations • 7 Jan 2019 • Théophane Weber, Nicolas Heess, Lars Buesing, David Silver
Stochastic computation graphs (SCGs) provide a formalism to represent structured optimization problems arising in artificial intelligence, including supervised, unsupervised, and reinforcement learning.
no code implementations • ICLR 2019 • Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess
In contrast to off-policy algorithms based on Importance Sampling which re-weight data, CF-GPS leverages a model to explicitly consider alternative outcomes, allowing the algorithm to make better use of experience data.
1 code implementation • ICLR 2019 • Karol Gregor, George Papamakarios, Frederic Besse, Lars Buesing, Theophane Weber
To act and plan in complex environments, we posit that agents should have a mental simulator of the world with three characteristics: (a) it should build an abstract state representing the condition of the world; (b) it should form a belief which represents uncertainty on the world; (c) it should go beyond simple step-by-step simulation, and exhibit temporal abstraction.
no code implementations • 8 Feb 2018 • Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David P. Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra
A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models.
no code implementations • ICLR 2018 • Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra
A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models.
no code implementations • NeurIPS 2017 • Artur Speiser, Jinyao Yan, Evan Archer, Lars Buesing, Srinivas C. Turaga, Jakob H. Macke
Calcium imaging permits optical measurement of neural activity.
2 code implementations • 19 Jul 2017 • Razvan Pascanu, Yujia Li, Oriol Vinyals, Nicolas Heess, Lars Buesing, Sebastien Racanière, David Reichert, Théophane Weber, Daan Wierstra, Peter Battaglia
Here we introduce the "Imagination-based Planner", the first model-based, sequential decision-making agent that can learn to construct, evaluate, and execute plans.
2 code implementations • NeurIPS 2017 • Théophane Weber, Sébastien Racanière, David P. Reichert, Lars Buesing, Arthur Guez, Danilo Jimenez Rezende, Adria Puigdomènech Badia, Oriol Vinyals, Nicolas Heess, Yujia Li, Razvan Pascanu, Peter Battaglia, Demis Hassabis, David Silver, Daan Wierstra
We introduce Imagination-Augmented Agents (I2As), a novel architecture for deep reinforcement learning combining model-free and model-based aspects.
Model-based Reinforcement Learning reinforcement-learning +1
1 code implementation • 23 Nov 2015 • Evan Archer, Il Memming Park, Lars Buesing, John Cunningham, Liam Paninski
These models have the advantage of learning latent structure both from noisy observations and from the temporal ordering in the data, where it is assumed that meaningful correlation structure exists across time.
no code implementations • NeurIPS 2014 • Lars Buesing, Timothy A. Machado, John P. Cunningham, Liam Paninski
High-dimensional, simultaneous recordings of neural spiking activity are often explored, analyzed and visualized with the help of latent variable or factor models.
no code implementations • NeurIPS 2015 • Mijung Park, Wittawat Jitkrittum, Ahmad Qamar, Zoltan Szabo, Lars Buesing, Maneesh Sahani
We introduce the Locally Linear Latent Variable Model (LL-LVM), a probabilistic model for non-linear manifold discovery that describes a joint distribution over observations, their manifold coordinates and locally linear maps conditioned on a set of neighbourhood relationships.
no code implementations • NeurIPS 2013 • Srini Turaga, Lars Buesing, Adam M. Packer, Henry Dalgleish, Noah Pettit, Michael Hausser, Jakob H. Macke
Simultaneous recordings of the activity of large neural populations are extremely valuable as they can be used to infer the dynamics and interactions of neurons in a local circuit, shedding light on the computations performed.
no code implementations • NeurIPS 2012 • Lars Buesing, Jakob H. Macke, Maneesh Sahani
Here, we show how spectral learning methods for linear systems with Gaussian observations (usually called subspace identification in this context) can be extended to estimate the parameters of dynamical system models observed through non-Gaussian noise models.
no code implementations • NeurIPS 2011 • Jakob H. Macke, Lars Buesing, John P. Cunningham, Byron M. Yu, Krishna V. Shenoy, Maneesh Sahani
Neurons in the neocortex code and compute as part of a locally interconnected population.
no code implementations • NeurIPS 2007 • Lars Buesing, Wolfgang Maass
We show that under suitable assumptions (primarily linearization) a simple and perspicuous online learning rule for Information Bottleneck optimization with spiking neurons can be derived.