Search Results for author: Joel Veness

Found 28 papers, 14 papers with code

A Monte Carlo AIXI Approximation

2 code implementations • 4 Sep 2009 • Joel Veness, Kee Siong Ng, Marcus Hutter, William Uther, David Silver

This paper introduces a principled approach for the design of a scalable general reinforcement learning agent.

General Reinforcement Learning • Open-Ended Question Answering • +2

Bootstrapping from Game Tree Search

no code implementations • NeurIPS 2009 • Joel Veness, David Silver, Alan Blair, William Uther

We implemented our algorithm in a chess program Meep, using a linear heuristic function.

Monte-Carlo Planning in Large POMDPs

no code implementations • NeurIPS 2010 • David Silver, Joel Veness

Our Monte-Carlo planning algorithm achieved a high level of performance with no prior knowledge, and was also able to exploit simple domain knowledge to achieve better results with less search.

Context Tree Switching

1 code implementation • 14 Nov 2011 • Joel Veness, Kee Siong Ng, Marcus Hutter, Michael Bowling

This paper describes the Context Tree Switching technique, a modification of Context Tree Weighting for the prediction of binary, stationary, n-Markov sources.

Information Theory
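Context Tree Weighting, which CTS modifies, uses the Krichevsky-Trofimov (KT) estimator as the base predictor at each context node. A minimal sketch of that estimator (an illustration of the standard KT rule, not code from the paper):

```python
def kt_prob_one(zeros: int, ones: int) -> float:
    """KT (Krichevsky-Trofimov) estimate of P(next bit = 1)
    after observing `zeros` 0s and `ones` 1s in a given context."""
    return (ones + 0.5) / (zeros + ones + 1.0)

def kt_sequence_prob(bits) -> float:
    """Sequential KT probability assigned to a whole binary string."""
    p, zeros, ones = 1.0, 0, 0
    for b in bits:
        p1 = kt_prob_one(zeros, ones)
        p *= p1 if b == 1 else (1.0 - p1)
        if b == 1:
            ones += 1
        else:
            zeros += 1
    return p
```

CTW mixes such per-context estimators over all tree depths; CTS replaces that fixed mixture with a switching mixture that can change model class over time.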

Variance Reduction in Monte-Carlo Tree Search

no code implementations • NeurIPS 2011 • Joel Veness, Marc Lanctot, Michael Bowling

Monte-Carlo Tree Search (MCTS) has proven to be a powerful, generic planning technique for decision-making in single-agent and adversarial environments.

Decision Making
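As a rough illustration of the tree policy used by MCTS variants such as UCT, here is a minimal UCB1 selection loop on a toy one-step problem (a hedged sketch for intuition only; it is not the paper's algorithm and shows none of its variance-reduction techniques):

```python
import math
import random

class Node:
    """A single MCTS node tracking per-action visit counts and value sums."""
    def __init__(self, actions):
        self.n = 0                                        # total visits
        self.children = {a: [0, 0.0] for a in actions}    # action -> [count, value sum]

    def select(self, c=1.414):
        """UCB1 tree policy: mean value plus an exploration bonus."""
        for a, (n_a, _) in self.children.items():
            if n_a == 0:
                return a                                  # try untried actions first
        return max(
            self.children,
            key=lambda a: (self.children[a][1] / self.children[a][0]
                           + c * math.sqrt(math.log(self.n) / self.children[a][0])),
        )

    def update(self, a, reward):
        self.n += 1
        self.children[a][0] += 1
        self.children[a][1] += reward

# Toy demo: action 1 pays 0.3 more on average than action 0.
random.seed(0)
root = Node(actions=[0, 1])
for _ in range(5000):
    a = root.select()
    reward = random.random() + (0.3 if a == 1 else 0.0)   # simulated rollout return
    root.update(a, reward)

best_action = max(root.children, key=lambda a: root.children[a][0])
```

In a real MCTS the rollout line would descend the tree and simulate to a terminal state; here the single node makes the selection rule concentrate visits on the better action.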

The Arcade Learning Environment: An Evaluation Platform for General Agents

3 code implementations • 19 Jul 2012 • Marc G. Bellemare, Yavar Naddaf, Joel Veness, Michael Bowling

We illustrate the promise of ALE by developing and benchmarking domain-independent agents designed using well-established AI techniques for both reinforcement learning and planning.

Atari Games • Benchmarking • +4

Sketch-Based Linear Value Function Approximation

no code implementations • NeurIPS 2012 • Marc Bellemare, Joel Veness, Michael Bowling

Unfortunately, the typical use of hashing in value function approximation results in biased value estimates due to the possibility of collisions.

Atari Games • reinforcement-learning • +1
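The collision bias mentioned above can be reproduced in a few lines; the toy hash function and states below are invented for illustration and are not the paper's sketch-based construction:

```python
# Hashed linear value features: each state maps to one index in a small table.
TABLE_SIZE = 8

def feature_index(state: str) -> int:
    # Python's built-in hash() is salted per process, so use a fixed toy hash.
    return sum(ord(c) for c in state) % TABLE_SIZE

weights = [0.0] * TABLE_SIZE

def td_update(state: str, target: float, alpha: float = 0.5) -> None:
    i = feature_index(state)
    weights[i] += alpha * (target - weights[i])

def value(state: str) -> float:
    return weights[feature_index(state)]

# "ab" and "ba" have the same character sum, so they collide: training on
# one state perturbs the other's estimate, and both settle on a biased
# value between their true values of 1.0 and 0.0.
for _ in range(20):
    td_update("ab", target=1.0)
    td_update("ba", target=0.0)
```

With these alternating targets both estimates converge to about 1/3, matching neither true value, which is the bias the paper's sketch-based approach is designed to remove.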

Online Learning of k-CNF Boolean Functions

no code implementations • 26 Mar 2014 • Joel Veness, Marcus Hutter

This paper revisits the problem of learning a k-CNF Boolean function from examples in the context of online learning under the logarithmic loss.

PAC learning

Compress and Control

no code implementations • 19 Nov 2014 • Joel Veness, Marc G. Bellemare, Marcus Hutter, Alvin Chua, Guillaume Desjardins

This paper describes a new information-theoretic policy evaluation technique for reinforcement learning.

Reinforcement Learning (RL)

Human-level control through deep reinforcement learning

7 code implementations • 25 Feb 2015 • Volodymyr Mnih, Koray Kavukcuoglu, David Silver, Andrei A. Rusu, Joel Veness, Marc G. Bellemare, Alex Graves, Martin Riedmiller, Andreas K. Fidjeland, Georg Ostrovski, Stig Petersen, Charles Beattie, Amir Sadik, Ioannis Antonoglou, Helen King, Dharshan Kumaran, Daan Wierstra, Shane Legg & Demis Hassabis

We demonstrate that the deep Q-network agent, receiving only the pixels and the game score as inputs, was able to surpass the performance of all previous algorithms and achieve a level comparable to that of a professional human games tester across a set of 49 games, using the same algorithm, network architecture and hyperparameters.

Atari Games • reinforcement-learning • +1

The Forget-me-not Process

no code implementations • NeurIPS 2016 • Kieran Milan, Joel Veness, James Kirkpatrick, Michael Bowling, Anna Koop, Demis Hassabis

We introduce the Forget-me-not Process, an efficient, non-parametric meta-algorithm for online probabilistic sequence prediction for piecewise stationary, repeating sources.

Revisiting the Arcade Learning Environment: Evaluation Protocols and Open Problems for General Agents

7 code implementations • 18 Sep 2017 • Marlos C. Machado, Marc G. Bellemare, Erik Talvitie, Joel Veness, Matthew Hausknecht, Michael Bowling

The Arcade Learning Environment (ALE) is an evaluation platform that poses the challenge of building AI agents with general competency across dozens of Atari 2600 games.

Atari Games

Online Learning with Gated Linear Networks

no code implementations • 5 Dec 2017 • Joel Veness, Tor Lattimore, Avishkar Bhoopchand, Agnieszka Grabska-Barwinska, Christopher Mattern, Peter Toth

This paper describes a family of probabilistic architectures designed for online learning under the logarithmic loss.
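In the GLN family, each neuron outputs a geometric mixture of its input probabilities — a sigmoid of a weighted sum of their logits — with weights adapted by online gradient descent under log loss. A minimal ungated sketch of that mixing rule (the paper's full architecture additionally gates each neuron's weights on side information, which this toy omits):

```python
import math
import random

def logit(p: float) -> float:
    return math.log(p / (1.0 - p))

def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))

class GeoMixNeuron:
    """Geometric mixture of expert probabilities: sigmoid of a weighted
    sum of logits, trained online by gradient descent on log loss."""
    def __init__(self, n: int, lr: float = 0.1):
        self.w = [1.0 / n] * n
        self.lr = lr

    def predict(self, probs) -> float:
        return sigmoid(sum(wi * logit(pi) for wi, pi in zip(self.w, probs)))

    def update(self, probs, target: int) -> float:
        p = self.predict(probs)
        # Gradient of log loss w.r.t. w_i is (p - target) * logit(p_i).
        for i, pi in enumerate(probs):
            self.w[i] -= self.lr * (p - target) * logit(pi)
        return p

# Expert 0 is informative, expert 1 is noise: online updates under log
# loss should upweight expert 0 relative to expert 1.
random.seed(1)
neuron = GeoMixNeuron(2)
for _ in range(500):
    y = random.randint(0, 1)
    probs = [0.9 if y == 1 else 0.1,     # expert 0: accurate
             random.uniform(0.2, 0.8)]   # expert 1: uninformative
    neuron.update(probs, y)
```

A convenient property of this rule, emphasized in the GLN line of work, is that each neuron's loss is convex in its own weights, so every unit can learn online without backpropagation.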

Online Learning in Contextual Bandits using Gated Linear Networks

no code implementations • NeurIPS 2020 • Eren Sezener, Marcus Hutter, David Budden, Jianan Wang, Joel Veness

We introduce a new and completely online contextual bandit algorithm called Gated Linear Contextual Bandits (GLCB).

Multi-Armed Bandits

Gaussian Gated Linear Networks

2 code implementations • NeurIPS 2020 • David Budden, Adam Marblestone, Eren Sezener, Tor Lattimore, Greg Wayne, Joel Veness

We propose the Gaussian Gated Linear Network (G-GLN), an extension to the recently proposed GLN family of deep neural networks.

Denoising • Density Estimation • +2

A Combinatorial Perspective on Transfer Learning

1 code implementation • NeurIPS 2020 • Jianan Wang, Eren Sezener, David Budden, Marcus Hutter, Joel Veness

Our main postulate is that the combination of task segmentation, modular learning and memory-based ensembling can give rise to generalization on an exponentially growing number of unseen tasks.

Continual Learning • Transfer Learning

Reinforcement Learning with Information-Theoretic Actuation

no code implementations • 30 Sep 2021 • Elliot Catt, Marcus Hutter, Joel Veness

In this work we explore and formalize a contrasting view, namely that actions are best thought of as the output of a sequence of internal choices with respect to an action model.

reinforcement-learning • Reinforcement Learning (RL)

Beyond Bayes-optimality: meta-learning what you know you don't know

no code implementations • 30 Sep 2022 • Jordi Grau-Moya, Grégoire Delétang, Markus Kunesch, Tim Genewein, Elliot Catt, Kevin Li, Anian Ruoss, Chris Cundy, Joel Veness, Jane Wang, Marcus Hutter, Christopher Summerfield, Shane Legg, Pedro Ortega

This is in contrast to risk-sensitive agents, which additionally exploit the higher-order moments of the return, and ambiguity-sensitive agents, which act differently when recognizing situations in which they lack knowledge.

Decision Making • Meta-Learning

Language Modeling Is Compression

1 code implementation • 19 Sep 2023 • Grégoire Delétang, Anian Ruoss, Paul-Ambroise Duquenne, Elliot Catt, Tim Genewein, Christopher Mattern, Jordi Grau-Moya, Li Kevin Wenliang, Matthew Aitchison, Laurent Orseau, Marcus Hutter, Joel Veness

We show that large language models are powerful general-purpose predictors and that the compression viewpoint provides novel insights into scaling laws, tokenization, and in-context learning.

In-Context Learning • Language Modelling
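The prediction-compression correspondence this paper exploits: an arithmetic coder driven by any sequential predictor compresses a sequence to roughly the sum of -log2 p(x_t | x_<t) bits. A toy sketch with a Laplace-smoothed unigram predictor standing in for the language model (an illustration of the general principle, not the paper's setup):

```python
import math

def code_length_bits(seq, predict) -> float:
    """Ideal compressed size of `seq` under `predict`: an arithmetic
    coder achieves sum_t -log2 p(x_t | x_<t) bits, up to O(1) overhead."""
    total = 0.0
    for t, x in enumerate(seq):
        p = predict(seq[:t])[x]
        total += -math.log2(p)
    return total

def laplace_predict(context):
    """Laplace-smoothed unigram model over the binary alphabet {0, 1}."""
    ones = sum(context)
    p1 = (ones + 1) / (len(context) + 2)
    return {0: 1.0 - p1, 1: p1}

skewed = [1] * 90 + [0] * 10     # low-entropy source: compresses well
balanced = [1, 0] * 50           # high-entropy source: near 1 bit/symbol
```

Swapping `laplace_predict` for a language model's conditional probabilities gives exactly the lossless-compression view of the paper: a better predictor means fewer bits.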

Learning Universal Predictors

1 code implementation • 26 Jan 2024 • Jordi Grau-Moya, Tim Genewein, Marcus Hutter, Laurent Orseau, Grégoire Delétang, Elliot Catt, Anian Ruoss, Li Kevin Wenliang, Christopher Mattern, Matthew Aitchison, Joel Veness

Meta-learning has emerged as a powerful approach to train neural networks to learn new tasks quickly from limited data.

Meta-Learning
