Search Results for author: Tom Schaul

Found 40 papers, 16 papers with code

Plasticity as the Mirror of Empowerment

no code implementations · 15 May 2025 · David Abel, Michael Bowling, André Barreto, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh

In this paper, we ground this concept in a universal agent-centric measure that we refer to as plasticity, and reveal a fundamental connection to empowerment.

AuPair: Golden Example Pairs for Code Repair

no code implementations · 12 Feb 2025 · Aditi Mavalankar, Hassan Mansoor, Zita Marinho, Masha Samsikova, Tom Schaul

Our algorithm yields a significant boost in performance compared to best-of-$N$ and self-repair, and also exhibits strong generalisation across datasets and models.

Code Repair · In-Context Learning

Agency Is Frame-Dependent

no code implementations · 6 Feb 2025 · David Abel, André Barreto, Michael Bowling, Will Dabney, Shi Dong, Steven Hansen, Anna Harutyunyan, Khimya Khetarpal, Clare Lyle, Razvan Pascanu, Georgios Piliouras, Doina Precup, Jonathan Richens, Mark Rowland, Tom Schaul, Satinder Singh

Agency is a system's capacity to steer outcomes toward a goal, and is a central topic of study across biology, philosophy, cognitive science, and artificial intelligence.

Philosophy · reinforcement-learning +1

Boundless Socratic Learning with Language Games

1 code implementation · 25 Nov 2024 · Tom Schaul

An agent trained within a closed system can master any desired capability, as long as the following three conditions hold: (a) it receives sufficiently informative and aligned feedback, (b) its coverage of experience/data is broad enough, and (c) it has sufficient capacity and resource.

Open-Endedness is Essential for Artificial Superhuman Intelligence

no code implementations · 6 Jun 2024 · Edward Hughes, Michael Dennis, Jack Parker-Holder, Feryal Behbahani, Aditi Mavalankar, Yuge Shi, Tom Schaul, Tim Rocktaschel

In this position paper, we argue that the ingredients are now in place to achieve open-endedness in AI systems with respect to a human observer.

Discovering Attention-Based Genetic Algorithms via Meta-Black-Box Optimization

1 code implementation · 8 Apr 2023 · Robert Tjarko Lange, Tom Schaul, Yutian Chen, Chris Lu, Tom Zahavy, Valentin Dalibard, Sebastian Flennerhag

Genetic algorithms constitute a family of black-box optimization algorithms, which take inspiration from the principles of biological evolution.
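The family described in the abstract can be illustrated with a minimal genetic algorithm: a population of candidate solutions improved purely through black-box fitness evaluations, here via truncation selection and Gaussian mutation. This is a generic sketch (all names are my own), not the attention-based algorithm the paper discovers.

```python
import random

def genetic_algorithm(fitness, dim, pop_size=40, generations=60,
                      mutation_scale=0.3, elite_frac=0.25, seed=0):
    """Minimal black-box GA: truncation selection plus Gaussian mutation."""
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    n_elite = max(1, int(elite_frac * pop_size))
    for _ in range(generations):
        pop.sort(key=fitness)          # minimisation: lowest fitness first
        elites = pop[:n_elite]         # survivors kept unchanged (elitism)
        # Refill the population by mutating randomly chosen elites.
        pop = elites + [
            [x + rng.gauss(0, mutation_scale) for x in rng.choice(elites)]
            for _ in range(pop_size - n_elite)
        ]
    return min(pop, key=fitness)

# Usage: minimise the sphere function, a standard black-box benchmark.
sphere = lambda x: sum(v * v for v in x)
best = genetic_algorithm(sphere, dim=5)
```

The fitness function is only ever queried as a black box, which is what makes such algorithms a target for meta-black-box optimization.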

Scaling Goal-based Exploration via Pruning Proto-goals

1 code implementation · 9 Feb 2023 · Akhil Bagaria, Ray Jiang, Ramana Kumar, Tom Schaul

One of the gnarliest challenges in reinforcement learning (RL) is exploration that scales to vast domains, where novelty- or coverage-seeking behaviour falls short.

reinforcement-learning · Reinforcement Learning (RL)

The Phenomenon of Policy Churn

no code implementations · 1 Jun 2022 · Tom Schaul, André Barreto, John Quan, Georg Ostrovski

We identify and study the phenomenon of policy churn, that is, the rapid change of the greedy policy in value-based reinforcement learning.
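The quantity being studied can be made concrete: churn is the fraction of states whose greedy action changes between two snapshots of the action-value estimates. A toy sketch (my own illustration, not the paper's agents or experiments) where a small random perturbation stands in for one gradient update:

```python
import numpy as np

def greedy_policy(q_table):
    """Greedy action per state from a |S| x |A| table of action values."""
    return q_table.argmax(axis=1)

def policy_churn(q_before, q_after):
    """Fraction of states whose greedy action changed between two updates."""
    return float(np.mean(greedy_policy(q_before) != greedy_policy(q_after)))

# Even a small perturbation of the value estimates (standing in for a
# single learning update) flips the greedy action in some fraction of states.
rng = np.random.default_rng(0)
q = rng.normal(size=(1000, 4))               # 1000 states, 4 actions
q_next = q + 0.1 * rng.normal(size=q.shape)  # slightly updated estimates
churn = policy_churn(q, q_next)
```

The point of the measurement is that in deep value-based agents this fraction can be surprisingly large per update.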

reinforcement-learning · Reinforcement Learning (RL)

Model-Value Inconsistency as a Signal for Epistemic Uncertainty

no code implementations · 8 Dec 2021 · Angelos Filos, Eszter Vértes, Zita Marinho, Gregory Farquhar, Diana Borsa, Abram Friesen, Feryal Behbahani, Tom Schaul, André Barreto, Simon Osindero

Unlike prior work which estimates uncertainty by training an ensemble of many models and/or value functions, this approach requires only the single model and value function which are already being learned in most model-based reinforcement learning algorithms.
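The idea can be sketched with the single model and value function the abstract mentions: unroll the model for 0, 1, ..., k steps, bootstrap each rollout with the value function, and treat disagreement among the resulting estimates as an uncertainty signal. The toy functions below are invented for illustration and are not the paper's implementation.

```python
import numpy as np

def k_step_values(state, model, value_fn, reward_fn, horizon=3, gamma=0.99):
    """Value estimates implied by unrolling a learned model 0..horizon steps.

    k = 0 is just value_fn(state); larger k substitutes model-predicted
    rewards plus a later bootstrap. If model and value function were
    perfectly consistent, all estimates would coincide.
    """
    estimates = []
    for k in range(horizon + 1):
        s, ret, discount = state, 0.0, 1.0
        for _ in range(k):
            ret += discount * reward_fn(s)
            s = model(s)
            discount *= gamma
        estimates.append(ret + discount * value_fn(s))
    return np.array(estimates)

def inconsistency(estimates):
    """Spread of the k-step estimates: an ensemble-free uncertainty proxy."""
    return float(np.std(estimates))

# Toy 1-D example: a "learned" model that drifts right, a reward that decays
# with position, and a deliberately imperfect value estimate.
model = lambda s: s + 1.0
reward_fn = lambda s: -0.1 * s
value_fn = lambda s: -0.05 * s * s
ests = k_step_values(0.0, model, reward_fn, value_fn)
signal = inconsistency(ests)
```

Because the value function here is inconsistent with the model's rollouts, the estimates disagree and the signal is positive.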

Model-based Reinforcement Learning · Rolling Shutter Correction

Policy Evaluation Networks

no code implementations · 26 Feb 2020 · Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states.
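The flipped convention can be illustrated in its plainest form: rather than evaluating one policy across many states, estimate the value of many policies on one fixed set of states. The Monte-Carlo toy below (a 1-D chain of my own invention, where each policy is summarised by a single step-right probability) shows the shape of the computation, not the paper's network over policy fingerprints.

```python
import random

def rollout_return(p_right, start, rng, horizon=50, gamma=0.95):
    """One Monte-Carlo return on a 1-D chain: reward 1 on reaching state 5,
    which ends the episode; the policy is a single step-right probability."""
    s, discount = start, 1.0
    for _ in range(horizon):
        s += 1 if rng.random() < p_right else -1
        if s == 5:
            return discount * 1.0
        discount *= gamma
    return 0.0

def evaluate_policies(policies, states, episodes=500, seed=0):
    """The flipped convention: values of MANY policies on one fixed state set."""
    rng = random.Random(seed)
    return {
        p: {s: sum(rollout_return(p, s, rng) for _ in range(episodes)) / episodes
            for s in states}
        for p in policies
    }

values = evaluate_policies(policies=[0.3, 0.6, 0.9], states=[0, 2])
```

The output is a value table indexed by policy first and state second, which is exactly the axis swap the sentence above describes.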

Reinforcement Learning

Adapting Behaviour for Learning Progress

no code implementations · 14 Dec 2019 · Tom Schaul, Diana Borsa, David Ding, David Szepesvari, Georg Ostrovski, Will Dabney, Simon Osindero

Determining what experience to generate to best facilitate learning (i.e., exploration) is one of the distinguishing features and open challenges in reinforcement learning.

Atari Games · Reinforcement Learning

Non-Differentiable Supervised Learning with Evolution Strategies and Hybrid Methods

no code implementations · 7 Jun 2019 · Karel Lenc, Erich Elsen, Tom Schaul, Karen Simonyan

While using ES for differentiable parameters is computationally impractical (although possible), we show that a hybrid approach is practically feasible in the case where the model has both differentiable and non-differentiable parameters.

text-to-speech · Text to Speech

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations · 16 Nov 2018 · Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Meta-Learning by the Baldwin Effect

no code implementations · 6 Jun 2018 · Chrisantha Thomas Fernando, Jakub Sygnowski, Simon Osindero, Jane Wang, Tom Schaul, Denis Teplyashin, Pablo Sprechmann, Alexander Pritzel, Andrei A. Rusu

The scope of the Baldwin effect was recently called into question by two papers that closely examined the seminal work of Hinton and Nowlan.

Meta-Learning · Reinforcement Learning

Deep Q-learning from Demonstrations

5 code implementations · 12 Apr 2017 · Todd Hester, Matej Vecerik, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Gabriel Dulac-Arnold, Ian Osband, John Agapiou, Joel Z. Leibo, Audrunas Gruslys

We present Deep Q-learning from Demonstrations (DQfD), an algorithm that leverages small amounts of demonstration data to massively accelerate the learning process, and that automatically assesses the necessary ratio of demonstration data while learning, thanks to a prioritized replay mechanism.

Deep Reinforcement Learning · Imitation Learning +2

Reinforcement Learning with Unsupervised Auxiliary Tasks

3 code implementations · 16 Nov 2016 · Max Jaderberg, Volodymyr Mnih, Wojciech Marian Czarnecki, Tom Schaul, Joel Z. Leibo, David Silver, Koray Kavukcuoglu

We also introduce a novel mechanism for focusing this representation upon extrinsic rewards, so that learning can rapidly adapt to the most relevant aspects of the actual task.

Deep Reinforcement Learning · reinforcement-learning +1

Prioritized Experience Replay

77 code implementations · 18 Nov 2015 · Tom Schaul, John Quan, Ioannis Antonoglou, David Silver

Experience replay lets online reinforcement learning agents remember and reuse experiences from the past.
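The paper's contribution is to replay important transitions more often, with priorities proportional to TD error. A minimal list-based sketch of proportional prioritization (the paper uses a sum-tree for efficient sampling; class and method names here are my own):

```python
import random

class PrioritizedReplay:
    """Proportional prioritized replay, simplified for illustration."""

    def __init__(self, capacity=10000, alpha=0.6, beta=0.4, eps=1e-2):
        self.capacity, self.alpha, self.beta, self.eps = capacity, alpha, beta, eps
        self.data, self.priorities = [], []

    def add(self, transition, td_error=1.0):
        if len(self.data) >= self.capacity:        # drop the oldest transition
            self.data.pop(0); self.priorities.pop(0)
        self.data.append(transition)
        self.priorities.append((abs(td_error) + self.eps) ** self.alpha)

    def sample(self, batch_size, rng=random):
        total = sum(self.priorities)
        probs = [p / total for p in self.priorities]
        idx = rng.choices(range(len(self.data)), weights=probs, k=batch_size)
        n = len(self.data)
        # Importance-sampling weights correct the non-uniform sampling bias,
        # normalised by the largest weight for stability.
        w = [(n * probs[i]) ** (-self.beta) for i in idx]
        w_max = max(w)
        return idx, [self.data[i] for i in idx], [x / w_max for x in w]

    def update_priorities(self, idx, td_errors):
        for i, e in zip(idx, td_errors):
            self.priorities[i] = (abs(e) + self.eps) ** self.alpha

# Usage: transitions with larger TD error are sampled more often.
buf = PrioritizedReplay(capacity=100)
for t in range(10):
    buf.add(("s%d" % t, "a", 0.0, "s%d" % (t + 1)), td_error=float(t))
idx, batch, weights = buf.sample(4, rng=random.Random(0))
```

After each learning step the sampled transitions' priorities would be refreshed via `update_priorities` with their new TD errors.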

Atari Games · reinforcement-learning +2

Unit Tests for Stochastic Optimization

no code implementations · 20 Dec 2013 · Tom Schaul, Ioannis Antonoglou, David Silver

Optimization by stochastic gradient descent is an important component of many large-scale machine learning algorithms.
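In the spirit of the title, an optimizer can be checked on a small, well-understood stochastic problem before being trusted at scale. A sketch of such a unit test (the problem and tolerance are my own choices, not the paper's test suite):

```python
import random

def sgd(grad, x0, lr=0.1, steps=200, rng=None):
    """Plain SGD on a scalar stochastic objective."""
    rng = rng or random.Random(0)
    x = x0
    for _ in range(steps):
        x -= lr * grad(x, rng)
    return x

# Unit-test-style check: a noisy quadratic with its minimum at x = 3.
# An optimizer that cannot recover the minimum here should not be
# trusted on a large-scale learning problem.
noisy_quadratic_grad = lambda x, rng: 2 * (x - 3.0) + rng.gauss(0, 0.1)
x_final = sgd(noisy_quadratic_grad, x0=-5.0)
```

The value of such tests is that failures are diagnosable: the objective, noise level, and optimum are all known exactly.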

Stochastic Optimization

Adaptive learning rates and parallelization for stochastic, sparse, non-smooth gradients

no code implementations · 16 Jan 2013 · Tom Schaul, Yann LeCun

Recent work has established an empirically successful framework for adapting learning rates for stochastic gradient descent (SGD).

No More Pesky Learning Rates

no code implementations · 6 Jun 2012 · Tom Schaul, Sixin Zhang, Yann LeCun

The performance of stochastic gradient descent (SGD) depends critically on how learning rates are tuned and decreased over time.
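The flavour of the proposed fix can be sketched with running statistics of the gradient: set the learning rate to (mean gradient)² divided by (curvature × mean squared gradient), so the rate shrinks automatically as noise starts to dominate near the optimum. This is a heavily simplified illustration of that idea, with the curvature assumed known rather than estimated as in the paper.

```python
import random

def adaptive_sgd(grad, x0, steps=300, tau=10.0, rng=None):
    """Adaptive-rate SGD sketch: lr = g_bar^2 / (h * v_bar), where g_bar and
    v_bar are running means of the gradient and squared gradient, and h is
    the curvature (assumed known here; the paper estimates it online)."""
    rng = rng or random.Random(1)
    x = x0
    g_bar, v_bar, h = 0.0, 1.0, 2.0  # h matches the quadratic objective below
    for _ in range(steps):
        g = grad(x, rng)
        g_bar += (g - g_bar) / tau        # running mean of the gradient
        v_bar += (g * g - v_bar) / tau    # running mean of the squared gradient
        lr = (g_bar * g_bar) / (h * v_bar + 1e-12)
        x -= lr * g
    return x

# Noisy quadratic with minimum at x = 1 and curvature 2: far from the optimum
# the rate approaches 1/h; near it, noise dominates v_bar and the rate anneals.
quad_grad = lambda x, rng: 2 * (x - 1.0) + rng.gauss(0, 0.5)
x_adapt = adaptive_sgd(quad_grad, x0=10.0)
```

Since the running mean squared can never exceed the running mean of squares, the rate stays below 1/h, which keeps the update stable without any hand-tuned schedule.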

Measuring Intelligence through Games

no code implementations · 6 Sep 2011 · Tom Schaul, Julian Togelius, Jürgen Schmidhuber

Artificial general intelligence (AGI) refers to research aimed at tackling the full problem of artificial intelligence, that is, creating truly intelligent agents.

Motion Planning

Natural Evolution Strategies

1 code implementation · 22 Jun 2011 · Daan Wierstra, Tom Schaul, Tobias Glasmachers, Yi Sun, Jürgen Schmidhuber

This paper presents Natural Evolution Strategies (NES), a recent family of algorithms that constitute a more principled approach to black-box optimization than established evolutionary algorithms.
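The core loop of this family can be sketched in a few lines: sample candidates from a Gaussian search distribution, rank-normalise their fitness (fitness shaping), and move the distribution's mean along the estimated gradient of expected fitness. The sketch below keeps the standard deviation fixed for simplicity; the full NES family also adapts the covariance using the natural gradient, which this simplification omits.

```python
import random

def nes_minimize(f, dim, iters=200, pop=20, lr_mu=0.5, sigma=0.3, seed=0):
    """Search-gradient sketch in the spirit of NES (mean update only)."""
    rng = random.Random(seed)
    mu = [rng.uniform(-3, 3) for _ in range(dim)]
    for _ in range(iters):
        # Sample a population from the Gaussian search distribution N(mu, sigma^2 I).
        noise = [[rng.gauss(0, 1) for _ in range(dim)] for _ in range(pop)]
        fit = [f([m + sigma * e for m, e in zip(mu, eps)]) for eps in noise]
        # Fitness shaping: replace raw fitness with centred ranks,
        # +1 for the best (lowest) sample down to -1 for the worst.
        order = sorted(range(pop), key=lambda k: fit[k])
        util = [0.0] * pop
        for rank, k in enumerate(order):
            util[k] = (pop - 1 - 2.0 * rank) / (pop - 1)
        # Monte-Carlo estimate of the gradient of expected utility w.r.t. mu.
        grad = [sum(util[k] * noise[k][i] for k in range(pop)) / pop
                for i in range(dim)]
        mu = [m + lr_mu * sigma * g for m, g in zip(mu, grad)]
    return mu

# Usage: minimise the sphere function via black-box queries only.
sphere = lambda x: sum(v * v for v in x)
mu_final = nes_minimize(sphere, dim=4)
```

Rank-based shaping makes the update invariant to monotone transformations of the objective, one of the principled properties the NES framework emphasises.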

Evolutionary Algorithms · global-optimization
