1 code implementation • 6 Feb 2023 • Tim Genewein, Grégoire Delétang, Anian Ruoss, Li Kevin Wenliang, Elliot Catt, Vincent Dutordoir, Jordi Grau-Moya, Laurent Orseau, Marcus Hutter, Joel Veness
Memory-based meta-learning is a technique for approximating Bayes-optimal predictors.
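To make the idea concrete, here is a minimal sketch of memory-based meta-learning on a toy task family (coin flips with a bias drawn uniformly at random), which is not the paper's setup: an LSTM trained with log loss on sequences sampled from this prior comes to imitate the Bayes-optimal (Laplace-rule) predictor. All names and hyperparameters below are illustrative.

```python
# Minimal sketch of memory-based meta-learning (illustrative toy setup, not the paper's).
# Tasks are Bernoulli coins with bias theta ~ Uniform(0, 1); an LSTM trained with log
# loss on flips sampled from this prior approximates the Bayes-optimal (Laplace) rule.
import torch
import torch.nn as nn

class MetaPredictor(nn.Module):
    def __init__(self, hidden=32):
        super().__init__()
        self.rnn = nn.LSTM(input_size=1, hidden_size=hidden, batch_first=True)
        self.out = nn.Linear(hidden, 1)

    def forward(self, x):                       # x: (batch, time, 1) of past flips
        h, _ = self.rnn(x)
        return torch.sigmoid(self.out(h))       # P(next flip = 1) at every step

def sample_batch(batch=64, length=20):
    theta = torch.rand(batch, 1, 1)             # one coin bias per sequence (the "task")
    return (torch.rand(batch, length, 1) < theta).float()

model = MetaPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
for step in range(2000):
    flips = sample_batch()
    # shift inputs by one step so the prediction for flip t only sees flips < t
    inputs = torch.cat([torch.zeros_like(flips[:, :1]), flips[:, :-1]], dim=1)
    loss = nn.functional.binary_cross_entropy(model(inputs), flips)
    opt.zero_grad(); loss.backward(); opt.step()

# After training, the prediction after seeing k ones in n flips tracks (k+1)/(n+2),
# the Bayes-optimal posterior predictive under the uniform prior over theta.
```

The LSTM's memory is doing the work of a posterior over tasks, which is the sense in which memory-based meta-learning approximates a Bayes-optimal predictor.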
no code implementations • 30 Sep 2022 • Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, Marcus Hutter, Christopher Summerfield, Shane Legg, Pedro Ortega
This is in contrast to risk-sensitive agents, which additionally exploit the higher-order moments of the return, and ambiguity-sensitive agents, which act differently when recognizing situations in which they lack knowledge.
1 code implementation • 5 Jul 2022 • Grégoire Delétang, Anian Ruoss, Jordi Grau-Moya, Tim Genewein, Li Kevin Wenliang, Elliot Catt, Chris Cundy, Marcus Hutter, Shane Legg, Joel Veness, Pedro A. Ortega
Reliable generalization lies at the heart of safe ML and AI.
no code implementations • 20 Oct 2021 • Pedro A. Ortega, Markus Kunesch, Grégoire Delétang, Tim Genewein, Jordi Grau-Moya, Joel Veness, Jonas Buchli, Jonas Degrave, Bilal Piot, Julien Perolat, Tom Everitt, Corentin Tallec, Emilio Parisotto, Tom Erez, Yutian Chen, Scott Reed, Marcus Hutter, Nando de Freitas, Shane Legg
The recent phenomenal success of language models has reinvigorated machine learning research, and large sequence models such as transformers are being applied to a variety of domains.
no code implementations • 30 Sep 2021 • Elliot Catt, Marcus Hutter, Joel Veness
In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model.
1 code implementation • NeurIPS 2020 • Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness
Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks.
1 code implementation • NeurIPS 2020 • David Budden, Adam Marblestone, Eren Sezener, Tor Lattimore, Greg Wayne, Joel Veness
We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks.
no code implementations • NeurIPS 2020 • Eren Sezener, Marcus Hutter, David Budden, Jianan Wang, Joel Veness
We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB).
1 code implementation • 30 Sep 2019 • Joel Veness, Tor Lattimore, David Budden, Avishkar Bhoopchand, Christopher Mattern, Agnieszka Grabska-Barwinska, Eren Sezener, Jianan Wang, Peter Toth, Simon Schmitt, Marcus Hutter
This paper presents a new family of backpropagation-free neural architectures, Gated Linear Networks (GLNs).
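As a rough illustration of the building block, the sketch below implements a single gated geometric-mixing neuron with halfspace gating and its purely local online update on the log loss (no backpropagation). Sizes, learning rate, and the clipping range are illustrative; the full architecture stacks layers of such neurons.

```python
# Rough sketch of one Gated Linear Network neuron: geometric mixing of input
# probabilities, with the mixing weights selected ("gated") by the signs of random
# halfspaces applied to side information z. The update is local online gradient
# descent on the log loss -- no backpropagation through the network is needed.
import numpy as np

def logit(p):
    p = np.clip(p, 1e-4, 1 - 1e-4)
    return np.log(p / (1 - p))

class GLNNeuron:
    def __init__(self, n_inputs, side_dim, n_halfspaces=4, lr=0.01, rng=None):
        if rng is None:
            rng = np.random.default_rng(0)
        self.hyperplanes = rng.normal(size=(n_halfspaces, side_dim))
        # one weight vector per gating context (2 ** n_halfspaces contexts in total)
        self.weights = np.full((2 ** n_halfspaces, n_inputs), 1.0 / n_inputs)
        self.lr = lr

    def _context(self, z):
        bits = (self.hyperplanes @ z > 0).astype(int)
        return int(bits @ (2 ** np.arange(len(bits))))

    def predict(self, p, z):
        c = self._context(z)
        return 1.0 / (1.0 + np.exp(-self.weights[c] @ logit(p)))

    def update(self, p, z, target):
        c = self._context(z)
        out = self.predict(p, z)
        # gradient step on the log loss, applied only to the active weight vector
        self.weights[c] -= self.lr * (out - target) * logit(p)
        self.weights[c] = np.clip(self.weights[c], -5.0, 5.0)   # keep outputs bounded
        return out
```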
no code implementations • 8 May 2019 • Pedro A. Ortega, Jane X. Wang, Mark Rowland, Tim Genewein, Zeb Kurth-Nelson, Razvan Pascanu, Nicolas Heess, Joel Veness, Alex Pritzel, Pablo Sprechmann, Siddhant M. Jayakumar, Tom McGrath, Kevin Miller, Mohammad Azar, Ian Osband, Neil Rabinowitz, András György, Silvia Chiappa, Simon Osindero, Yee Whye Teh, Hado van Hasselt, Nando de Freitas, Matthew Botvinick, Shane Legg
In this report we review memory-based meta-learning as a tool for building sample-efficient strategies that learn from past experience to adapt to any task within a target class.
no code implementations • 5 Dec 2017 • Joel Veness, Tor Lattimore, Avishkar Bhoopchand, Agnieszka Grabska-Barwinska, Christopher Mattern, Peter Toth
This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss.
6 code implementations • 18 Sep 2017 • Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling
The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games.
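For readers who want to try the platform, here is a minimal random-agent loop. It assumes the gymnasium and ale-py Python packages (and the Atari ROMs) are installed, and it uses the modern "ALE/Breakout-v5" environment id, which postdates and differs from the raw ALE interface described in the paper.

```python
# Minimal random-agent loop on one ALE game via gymnasium. Assumes `gymnasium`,
# `ale-py`, and the Atari ROMs are installed; the environment id below follows
# current conventions rather than the raw ALE interface from the paper.
import gymnasium as gym
import ale_py  # registers the ALE environments (some versions also need gym.register_envs(ale_py))

env = gym.make("ALE/Breakout-v5")
obs, info = env.reset(seed=0)
total_reward, done = 0.0, False
while not done:
    action = env.action_space.sample()                    # replace with a learning agent
    obs, reward, terminated, truncated, info = env.step(action)
    total_reward += reward
    done = terminated or truncated
env.close()
print("episode return:", total_reward)
```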
20 code implementations • 2 Dec 2016 • James Kirkpatrick, Razvan Pascanu, Neil Rabinowitz, Joel Veness, Guillaume Desjardins, Andrei A. Rusu, Kieran Milan, John Quan, Tiago Ramalho, Agnieszka Grabska-Barwinska, Demis Hassabis, Claudia Clopath, Dharshan Kumaran, Raia Hadsell
The ability to learn tasks in a sequential fashion is crucial to the development of artificial intelligence.
Ranked #3 on Continual Learning on F-CelebA (10 tasks)
no code implementations • NeurIPS 2016 • Kieran Milan, Joel Veness, James Kirkpatrick, Michael Bowling, Anna Koop, Demis Hassabis
We introduce the Forget-me-not Process, an efficient, non-parametric meta-algorithm for online probabilistic sequence prediction for piecewise stationary, repeating sources.
5 code implementations • 25 Feb 2015 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg, Demis Hassabis
We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.
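As a reminder of the mechanics behind that result, here is a heavily scaled-down sketch of the core DQN update: a Q-network, a periodically synced target network, a replay buffer, and the one-step TD target r + γ max_a' Q_target(s', a'). The dimensions, learning rate, and synthetic transitions are placeholders, not the Atari configuration.

```python
# Heavily scaled-down sketch of the DQN update (placeholder dimensions and synthetic
# transitions, not the Atari setup): Q-network, target network, replay buffer, and
# the one-step TD target r + gamma * max_a' Q_target(s', a').
import random
from collections import deque
import torch
import torch.nn as nn

STATE_DIM, N_ACTIONS, GAMMA = 4, 2, 0.99

def make_net():
    return nn.Sequential(nn.Linear(STATE_DIM, 64), nn.ReLU(), nn.Linear(64, N_ACTIONS))

q_net, target_net = make_net(), make_net()
target_net.load_state_dict(q_net.state_dict())
opt = torch.optim.Adam(q_net.parameters(), lr=1e-3)
replay = deque(maxlen=10_000)

def act(state, epsilon=0.1):
    """Epsilon-greedy action selection from the current Q-network."""
    if random.random() < epsilon:
        return random.randrange(N_ACTIONS)
    with torch.no_grad():
        return int(q_net(torch.as_tensor(state, dtype=torch.float32)).argmax())

def train_step(batch_size=32):
    batch = random.sample(replay, batch_size)
    s, a, r, s2, done = (torch.as_tensor(x, dtype=torch.float32) for x in zip(*batch))
    q = q_net(s).gather(1, a.long().unsqueeze(1)).squeeze(1)
    with torch.no_grad():                                   # target network is held fixed
        target = r + GAMMA * (1 - done) * target_net(s2).max(dim=1).values
    loss = nn.functional.mse_loss(q, target)
    opt.zero_grad(); loss.backward(); opt.step()

# Synthetic transitions so the update runs stand-alone; in practice these come from
# epsilon-greedy interaction with the environment via act().
for _ in range(1000):
    s = [random.random() for _ in range(STATE_DIM)]
    s2 = [random.random() for _ in range(STATE_DIM)]
    replay.append((s, random.randrange(N_ACTIONS), random.random(), s2,
                   float(random.random() < 0.05)))
for step in range(100):
    train_step()
    if step % 20 == 0:
        target_net.load_state_dict(q_net.state_dict())      # periodic target sync
```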
no code implementations • 19 Nov 2014 • Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins
This paper describes a new information-theoretic policy evaluation technique for reinforcement learning.
no code implementations • 26 Mar 2014 • Joel Veness, Marcus Hutter
This paper revisits the problem of learning a k-CNF Boolean function from examples in the context of online learning under the logarithmic loss.
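For context, the sketch below shows the classical elimination-style learner for k-CNF in the realizable 0/1-loss setting; it is not the paper's log-loss algorithm, but it illustrates the hypothesis class in question.

```python
# Classical elimination learner for k-CNF (realizable 0/1-loss setting) -- not the
# paper's log-loss algorithm, but it shows the hypothesis class being learned.
from itertools import combinations, product

def all_clauses(n, k):
    """All disjunctions of at most k literals over x_0..x_{n-1}.
    A literal is (variable index, polarity); a clause is a frozenset of literals."""
    literals = [(i, b) for i in range(n) for b in (True, False)]
    return {frozenset(combo)
            for size in range(1, k + 1)
            for combo in combinations(literals, size)}

def satisfies(clause, x):
    return any(x[i] == b for i, b in clause)

class KCNFLearner:
    def __init__(self, n, k):
        self.clauses = all_clauses(n, k)        # start from the most restrictive k-CNF

    def predict(self, x):
        return all(satisfies(c, x) for c in self.clauses)

    def update(self, x, label):
        if label:                               # drop clauses falsified by a positive example
            self.clauses = {c for c in self.clauses if satisfies(c, x)}

# Example: learn the 2-CNF (x0 or x1) and (not x2) over three Boolean variables.
target = lambda x: (x[0] or x[1]) and not x[2]
learner = KCNFLearner(n=3, k=2)
for x in product([False, True], repeat=3):
    learner.update(x, target(x))
print(all(learner.predict(x) == target(x) for x in product([False, True], repeat=3)))  # True
```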
no code implementations • NeurIPS 2012 • Marc Bellemare, Joel Veness, Michael Bowling
Unfortunately, the typical use of hashing in value function approximation results in biased value estimates due to the possibility of collisions.
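The bias is easy to reproduce: in the toy numpy example below, unrelated features that hash to the same bin share one weight, so a state's hashed value estimate absorbs weight mass from features it never activates. The paper's sketch-based remedy is not shown.

```python
# Toy illustration of the bias from hash collisions in linear value-function
# approximation: unrelated features mapped to the same bin share one weight, so a
# state's estimate picks up weight mass from features it never activates.
# (The paper's sketch-based remedy with random signs is not shown here.)
import numpy as np

rng = np.random.default_rng(0)
n_features, n_bins = 1000, 50                   # heavy compression forces collisions
hash_index = rng.integers(0, n_bins, size=n_features)

true_weights = rng.normal(size=n_features)
weights = np.zeros(n_bins)
np.add.at(weights, hash_index, true_weights)    # idealised training: bins accumulate weights

def true_value(active):
    return true_weights[active].sum()

def hashed_value(active):
    return weights[hash_index[active]].sum()    # includes colliding features' weights

state = rng.choice(n_features, size=10, replace=False)   # 10 active binary features
print("true value  :", true_value(state))
print("hashed value:", hashed_value(state))
```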
3 code implementations • 19 Jul 2012 • Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling
We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning.
Ranked #1 on Atari Games on Atari 2600 Pooyan
no code implementations • NeurIPS 2011 • Joel Veness, Marc Lanctot, Michael Bowling
Monte-Carlo Tree Search (MCTS) has proven to be a powerful, generic planning technique for decision-making in single-agent and adversarial environments.
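For orientation, the snippet below sketches the standard UCB1 child-selection rule at the core of UCT-style MCTS; the variance-reduction techniques that are the subject of this paper are not shown.

```python
# Sketch of the UCB1 rule used to select children in UCT-style MCTS; the
# variance-reduction techniques studied in this paper are not shown.
import math

def ucb1_select(children, exploration=math.sqrt(2)):
    """children: list of dicts with 'visits' (int) and 'total_return' (float).
    Returns the index of the child maximising mean return plus exploration bonus."""
    parent_visits = sum(c["visits"] for c in children)
    best, best_score = 0, -math.inf
    for i, c in enumerate(children):
        if c["visits"] == 0:
            return i                                        # try unvisited children first
        score = (c["total_return"] / c["visits"]
                 + exploration * math.sqrt(math.log(parent_visits) / c["visits"]))
        if score > best_score:
            best, best_score = i, score
    return best
```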
1 code implementation • 14 Nov 2011 • Joel Veness, Kee Siong Ng, Marcus Hutter, Michael Bowling
This paper describes the Context Tree Switching technique, a modification of Context Tree Weighting for the prediction of binary, stationary, n-Markov sources.
Information Theory
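Both Context Tree Weighting and Context Tree Switching use the Krichevsky-Trofimov (KT) estimator as the base predictor at each context-tree node; the sketch below shows only that base estimator and omits the tree-structured weighting and switching.

```python
# Krichevsky-Trofimov (KT) estimator, the base predictor at each node of Context
# Tree Weighting and Context Tree Switching; the tree-structured mixing/switching
# over contexts is omitted here.
import math

class KTEstimator:
    def __init__(self):
        self.counts = [0, 0]                     # numbers of 0s and 1s seen so far

    def prob(self, bit):
        """KT predictive probability of the next bit given the counts."""
        return (self.counts[bit] + 0.5) / (sum(self.counts) + 1.0)

    def update(self, bit):
        p = self.prob(bit)
        self.counts[bit] += 1
        return p

# Sequential code length assigned to a binary string (sum of -log2 predictive probs).
kt = KTEstimator()
bits = [1, 1, 0, 1, 1, 1, 0, 1]
code_length = -sum(math.log2(kt.update(b)) for b in bits)
print("code length (bits):", code_length)
```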
no code implementations • NeurIPS 2010 • David Silver, Joel Veness
Our Monte-Carlo planning algorithm achieved a high level of performance with no prior knowledge, and was also able to exploit simple domain knowledge to achieve better results with less search.
no code implementations • AAAI 2010 • Joel Veness, Kee Siong Ng, Marcus Hutter, David Silver
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent.
no code implementations • NeurIPS 2009 • Joel Veness, David Silver, Alan Blair, William Uther
We implemented our algorithm in the chess program Meep, using a linear heuristic function.
2 code implementations • 4 Sep 2009 • Joel Veness, Kee Siong Ng, Marcus Hutter, William Uther, David Silver
This paper introduces a principled approach for the design of a scalable general reinforcement learning agent.