Search Results for author: Matthew J. A. Smith

Found 3 papers, 0 papers with code

Why Target Networks Stabilise Temporal Difference Methods

no code implementations • 24 Feb 2023 • Mattie Fellows, Matthew J. A. Smith, Shimon Whiteson

Integral to recent successes in deep reinforcement learning has been a class of temporal difference methods that use infrequently updated target values for policy evaluation in a Markov Decision Process.


Stability and Generalisation in Batch Reinforcement Learning

no code implementations • 29 Sep 2021 • Matthew J. A. Smith, Shimon Whiteson

Overfitting has been recently acknowledged as a key limiting factor in the capabilities of reinforcement learning algorithms, despite little theoretical characterisation.

Reinforcement Learning (RL)


An inference-based policy gradient method for learning options

no code implementations • ICLR 2018 • Matthew J. A. Smith, Herke van Hoof, Joelle Pineau

In this work we develop a novel policy gradient method for the automatic learning of policies with options.

