Search Results for author: Theophane Weber

Found 26 papers, 10 papers with code

DiscoGen: Learning to Discover Gene Regulatory Networks

no code implementations • 12 Apr 2023 • Nan Rosemary Ke, Sara-Jane Dunn, Jorg Bornschein, Silvia Chiappa, Melanie Rey, Jean-Baptiste Lespiau, Albin Cassirer, Jane Wang, Theophane Weber, David Barrett, Matthew Botvinick, Anirudh Goyal, Mike Mozer, Danilo Rezende

To accurately identify GRNs, perturbational data is required.

Causal Discovery

Paper
Add Code

Learning to Induce Causal Structure

no code implementations • 11 Apr 2022 • Nan Rosemary Ke, Silvia Chiappa, Jane Wang, Anirudh Goyal, Jorg Bornschein, Melanie Rey, Theophane Weber, Matthew Botvinic, Michael Mozer, Danilo Jimenez Rezende

The fundamental challenge in causal induction is to infer the underlying graph structure given observational and/or interventional data.

Paper
Add Code

Retrieval-Augmented Reinforcement Learning

no code implementations • 17 Feb 2022 • Anirudh Goyal, Abram L. Friesen, Andrea Banino, Theophane Weber, Nan Rosemary Ke, Adria Puigdomenech Badia, Arthur Guez, Mehdi Mirza, Peter C. Humphreys, Ksenia Konyushkova, Laurent SIfre, Michal Valko, Simon Osindero, Timothy Lillicrap, Nicolas Heess, Charles Blundell

In this paper we explore an alternative paradigm in which we train a network to map a dataset of past experiences to optimal behavior.

reinforcement-learning Reinforcement Learning (RL) +1

Paper
Add Code

Muesli: Combining Improvements in Policy Optimization

2 code implementations • 13 Apr 2021 • Matteo Hessel, Ivo Danihelka, Fabio Viola, Arthur Guez, Simon Schmitt, Laurent SIfre, Theophane Weber, David Silver, Hado van Hasselt

We propose a novel policy update that combines regularized policy optimization with model learning as an auxiliary loss.

Ranked #8 on Atari Games on atari game

Atari Games Continuous Control

Paper
Code

Synthetic Returns for Long-Term Credit Assignment

2 code implementations • 24 Feb 2021 • David Raposo, Sam Ritter, Adam Santoro, Greg Wayne, Theophane Weber, Matt Botvinick, Hado van Hasselt, Francis Song

We propose state-associative (SA) learning, where the agent learns associations between states and arbitrarily distant future rewards, then propagates credit directly between the two.

3,521

Paper
Code

Divide-and-Conquer Monte Carlo Tree Search

no code implementations • 1 Jan 2021 • Giambattista Parascandolo, Lars Holger Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed.

Continuous Control Decision Making +1

Paper
Add Code

Model-Free Counterfactual Credit Assignment

no code implementations • 1 Jan 2021 • Thomas Mesnard, Theophane Weber, Fabio Viola, Shantanu Thakoor, Alaa Saade, Anna Harutyunyan, Will Dabney, Tom Stepleton, Nicolas Heess, Marcus Hutter, Lars Holger Buesing, Remi Munos

Credit assignment in reinforcement learning is the problem of measuring an action’s influence on future rewards.

counterfactual valid

Paper
Add Code

A case for new neural network smoothness constraints

no code implementations • 14 Dec 2020 • Mihaela Rosca, Theophane Weber, Arthur Gretton, Shakir Mohamed

How sensitive should machine learning models be to input changes?

Adversarial Robustness BIG-bench Machine Learning +3

Paper
Add Code

A case for new neural networks smoothness constraints

no code implementations • NeurIPS Workshop ICBINB 2020 • Mihaela Rosca, Theophane Weber, Arthur Gretton, Shakir Mohamed

How sensitive should machine learning models be to input changes?

Adversarial Robustness Inductive Bias +2

Paper
Add Code

Beyond Tabula-Rasa: a Modular Reinforcement Learning Approach for Physically Embedded 3D Sokoban

no code implementations • 3 Oct 2020 • Peter Karkus, Mehdi Mirza, Arthur Guez, Andrew Jaegle, Timothy Lillicrap, Lars Buesing, Nicolas Heess, Theophane Weber

We explore whether integrated tasks like Mujoban can be solved by composing RL modules together in a sense-plan-act hierarchy, where modules have well-defined roles similarly to classic robot architectures.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Divide-and-Conquer Monte Carlo Tree Search For Goal-Directed Planning

no code implementations • 23 Apr 2020 • Giambattista Parascandolo, Lars Buesing, Josh Merel, Leonard Hasenclever, John Aslanides, Jessica B. Hamrick, Nicolas Heess, Alexander Neitz, Theophane Weber

are constrained by an implicit sequential planning assumption: The order in which a plan is constructed is the same in which it is executed.

Continuous Control Decision Making +1

Paper
Add Code

Causally Correct Partial Models for Reinforcement Learning

no code implementations • 7 Feb 2020 • Danilo J. Rezende, Ivo Danihelka, George Papamakarios, Nan Rosemary Ke, Ray Jiang, Theophane Weber, Karol Gregor, Hamza Merzic, Fabio Viola, Jane Wang, Jovana Mitrovic, Frederic Besse, Ioannis Antonoglou, Lars Buesing

In reinforcement learning, we can learn a model of future observations and rewards, and use it to plan the agent's next actions.

reinforcement-learning Reinforcement Learning (RL)

Paper
Add Code

Combining Q-Learning and Search with Amortized Value Estimates

no code implementations • ICLR 2020 • Jessica B. Hamrick, Victor Bapst, Alvaro Sanchez-Gonzalez, Tobias Pfaff, Theophane Weber, Lars Buesing, Peter W. Battaglia

In SAVE, a learned prior over state-action values is used to guide MCTS, which estimates an improved set of state-action values.

Q-Learning

Paper
Add Code

Approximate Inference in Discrete Distributions with Monte Carlo Tree Search and Value Functions

1 code implementation • 15 Oct 2019 • Lars Buesing, Nicolas Heess, Theophane Weber

A plethora of problems in AI, engineering and the sciences are naturally formalized as inference in discrete probabilistic models.

Decision Making Decision Making Under Uncertainty

Paper
Code

Unsupervised Doodling and Painting with Improved SPIRAL

1 code implementation • 2 Oct 2019 • John F. J. Mellor, Eunbyung Park, Yaroslav Ganin, Igor Babuschkin, tejas kulkarni, Dan Rosenbaum, Andy Ballard, Theophane Weber, Oriol Vinyals, S. M. Ali Eslami

We investigate using reinforcement learning agents as generative models of images (extending arXiv:1804. 01118).

Paper
Code

Woulda, Coulda, Shoulda: Counterfactually-Guided Policy Search

no code implementations • ICLR 2019 • Lars Buesing, Theophane Weber, Yori Zwols, Sebastien Racaniere, Arthur Guez, Jean-Baptiste Lespiau, Nicolas Heess

In contrast to off-policy algorithms based on Importance Sampling which re-weight data, CF-GPS leverages a model to explicitly consider alternative outcomes, allowing the algorithm to make better use of experience data.

counterfactual

Paper
Add Code

Temporal Difference Variational Auto-Encoder

1 code implementation • ICLR 2019 • Karol Gregor, George Papamakarios, Frederic Besse, Lars Buesing, Theophane Weber

To act and plan in complex environments, we posit that agents should have a mental simulator of the world with three characteristics: (a) it should build an abstract state representing the condition of the world; (b) it should form a belief which represents uncertainty on the world; (c) it should go beyond simple step-by-step simulation, and exhibit temporal abstraction.

reinforcement-learning Reinforcement Learning (RL)

Paper
Code

Relational recurrent neural networks

2 code implementations • NeurIPS 2018 • Adam Santoro, Ryan Faulkner, David Raposo, Jack Rae, Mike Chrzanowski, Theophane Weber, Daan Wierstra, Oriol Vinyals, Razvan Pascanu, Timothy Lillicrap

Memory-based neural networks model temporal data by leveraging an ability to remember information for long periods.

Ranked #72 on Language Modelling on WikiText-103

Language Modelling Relational Reasoning

246

Paper
Code

Learning and Querying Fast Generative Models for Reinforcement Learning

no code implementations • 8 Feb 2018 • Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David P. Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra

A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models.

Atari Games Decision Making +3

Paper
Add Code

Learning Dynamic State Abstractions for Model-Based Reinforcement Learning

no code implementations • ICLR 2018 • Lars Buesing, Theophane Weber, Sebastien Racaniere, S. M. Ali Eslami, Danilo Rezende, David Reichert, Fabio Viola, Frederic Besse, Karol Gregor, Demis Hassabis, Daan Wierstra

A key challenge in model-based reinforcement learning (RL) is to synthesize computationally efficient and accurate environment models.

Atari Games Decision Making +3

Paper
Add Code

Visual Interaction Networks: Learning a Physics Simulator from Video

no code implementations • NeurIPS 2017 • Nicholas Watters, Daniel Zoran, Theophane Weber, Peter Battaglia, Razvan Pascanu, Andrea Tacchetti

We introduce the Visual Interaction Network, a general-purpose model for learning the dynamics of a physical system from raw visual observations.

Decision Making

Paper
Add Code

Visual Interaction Networks

3 code implementations • 5 Jun 2017 • Nicholas Watters, Andrea Tacchetti, Theophane Weber, Razvan Pascanu, Peter Battaglia, Daniel Zoran

We found that from just six input video frames the Visual Interaction Network can generate accurate future trajectories of hundreds of time steps on a wide range of physical systems.

Decision Making

322

Paper
Code

Attend, Infer, Repeat: Fast Scene Understanding with Generative Models

2 code implementations • NeurIPS 2016 • S. M. Ali Eslami, Nicolas Heess, Theophane Weber, Yuval Tassa, David Szepesvari, Koray Kavukcuoglu, Geoffrey E. Hinton

We present a framework for efficient inference in structured image models that explicitly reason about objects.

Scene Understanding

Paper
Code

Deep Reinforcement Learning in Large Discrete Action Spaces

2 code implementations • 24 Dec 2015 • Gabriel Dulac-Arnold, Richard Evans, Hado van Hasselt, Peter Sunehag, Timothy Lillicrap, Jonathan Hunt, Timothy Mann, Theophane Weber, Thomas Degris, Ben Coppin

Being able to reason in an environment with a large number of discrete actions is essential to bringing reinforcement learning to a larger class of problems.

Recommendation Systems reinforcement-learning +1

Paper
Code

Gradient Estimation Using Stochastic Computation Graphs

1 code implementation • NeurIPS 2015 • John Schulman, Nicolas Heess, Theophane Weber, Pieter Abbeel

In a variety of problems originating in supervised, unsupervised, and reinforcement learning, the loss function is defined by an expectation over a collection of random variables, which might be part of a probabilistic model or the external world.

Paper
Code

Automated Variational Inference in Probabilistic Programming

no code implementations • 7 Jan 2013 • David Wingate, Theophane Weber

We present a new algorithm for approximate inference in probabilistic programs, based on a stochastic gradient for variational programs.

Probabilistic Programming Variational Inference

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.