Search Results for author: Adam White

Found 50 papers, 9 papers with code

Tuning for the Unknown: Revisiting Evaluation Strategies for Lifelong RL

no code implementations • 2 Apr 2024 • Golnaz Mesbahi, Olya Mastikhina, Parham Mohammad Panahi, Martha White, Adam White

In this paper we propose a new approach for tuning and evaluating lifelong RL agents where only one percent of the experiment data can be used for hyperparameter tuning.

Application-Driven Innovation in Machine Learning

no code implementations • 26 Mar 2024 • David Rolnick, Alan Aspuru-Guzik, Sara Beery, Bistra Dilkina, Priya L. Donti, Marzyeh Ghassemi, Hannah Kerner, Claire Monteleoni, Esther Rolf, Milind Tambe, Adam White

As applications of machine learning proliferate, innovative algorithms inspired by specific real-world challenges have become increasingly important.

GVFs in the Real World: Making Predictions Online for Water Treatment

no code implementations • 4 Dec 2023 • Muhammad Kamran Janjua, Haseeb Shah, Martha White, Erfan Miahi, Marlos C. Machado, Adam White

In this paper we investigate the use of reinforcement-learning based prediction approaches for a real drinking-water treatment plant.

Time Series Prediction

Harnessing Discrete Representations For Continual Reinforcement Learning

no code implementations • 2 Dec 2023 • Edan Meyer, Adam White, Marlos C. Machado

In this work, we provide a thorough empirical investigation of the advantages of representing observations as vectors of categorical values within the context of reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

Predicting recovery following stroke: deep learning, multimodal data and feature selection using explainable AI

no code implementations • 29 Oct 2023 • Adam White, Margarita Saranti, Artur d'Avila Garcez, Thomas M. H. Hope, Cathy J. Price, Howard Bowman

The highest classification accuracy, 0.854, was observed when 8 regions-of-interest were extracted from each MRI scan and combined with lesion size, initial severity and recovery time in a 2D Residual Neural Network. Our findings demonstrate how imaging and tabular data can be combined for high post-stroke classification accuracy, even when the dataset is small in machine learning terms.

feature selection Stroke Classification

Recurrent Linear Transformers

1 code implementation • 24 Oct 2023 • Subhojeet Pramanik, Esraa Elelimy, Marlos C. Machado, Adam White

In this paper we introduce recurrent alternatives to the transformer self-attention mechanism that offer a context-independent inference cost, leverage long-range dependencies effectively, and perform well in practice.

Measuring and Mitigating Interference in Reinforcement Learning

no code implementations • 10 Jul 2023 • Vincent Liu, Han Wang, Ruo Yu Tao, Khurram Javed, Adam White, Martha White

Lastly, we outline a class of algorithms which we call online-aware that are designed to mitigate interference, and show they do reduce interference according to our measure and that they improve stability and performance in several classic control environments.

reinforcement-learning

Empirical Design in Reinforcement Learning

no code implementations • 3 Apr 2023 • Andrew Patterson, Samuel Neumann, Martha White, Adam White

The objective of this document is to provide answers on how we can use our unprecedented compute to do good science in reinforcement learning, as well as stay alert to potential pitfalls in our empirical design.

reinforcement-learning

The In-Sample Softmax for Offline Reinforcement Learning

4 code implementations • 28 Feb 2023 • Chenjun Xiao, Han Wang, Yangchen Pan, Adam White, Martha White

We highlight a simple fact: it is more straightforward to approximate an in-sample softmax using only actions in the dataset.

Offline RL reinforcement-learning +1
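The in-sample idea can be illustrated with a small sketch (the function name, mask representation, and temperature are illustrative, not from the paper): the softmax value is computed only over actions that actually appear in the dataset for a state, so out-of-sample actions with spuriously high value estimates are ignored.

```python
import numpy as np

def in_sample_softmax_value(q_values, in_sample_mask, tau=1.0):
    """Softmax state value computed only over actions observed in the dataset.

    q_values: action-values for one state.
    in_sample_mask: boolean array, True where the action appears in the dataset.
    tau: softmax temperature.
    """
    q = np.asarray(q_values, dtype=float)
    mask = np.asarray(in_sample_mask, dtype=bool)
    z = q[mask] / tau
    m = z.max()  # log-sum-exp with max subtraction for numerical stability
    return tau * (m + np.log(np.exp(z - m).sum()))

# An out-of-sample action with a spuriously high value estimate is ignored:
q = [1.0, 5.0, 2.0]          # action 1 never appears in the dataset
mask = [True, False, True]
v_in = in_sample_softmax_value(q, mask, tau=0.1)  # close to max over in-sample actions
```

At a low temperature the in-sample softmax approaches the maximum over dataset actions only, which is the property that makes it attractive for offline RL.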

Agent-State Construction with Auxiliary Inputs

1 code implementation • 15 Nov 2022 • Ruo Yu Tao, Adam White, Marlos C. Machado

Finally, we show that this approach is complementary to state-of-the-art methods such as recurrent neural networks and truncated back-propagation through time, and acts as a heuristic that facilitates longer temporal credit assignment, leading to better performance.

Decision Making reinforcement-learning +1

Auxiliary task discovery through generate-and-test

no code implementations • 25 Oct 2022 • Banafsheh Rafiee, Sina Ghiassian, Jun Jin, Richard Sutton, Jun Luo, Adam White

In this paper, we explore an approach to auxiliary task discovery in reinforcement learning based on ideas from representation learning.

Meta-Learning Representation Learning

Goal-Space Planning with Subgoal Models

no code implementations • 6 Jun 2022 • Chunlok Lo, Kevin Roice, Parham Mohammad Panahi, Scott Jordan, Adam White, Gabor Mihucz, Farzane Aminmansour, Martha White

In this paper, we avoid this limitation by constraining background planning to a set of (abstract) subgoals and learning only local, subgoal-conditioned models.

Model-based Reinforcement Learning Reinforcement Learning (RL)

What makes useful auxiliary tasks in reinforcement learning: investigating the effect of the target policy

no code implementations • 1 Apr 2022 • Banafsheh Rafiee, Jun Jin, Jun Luo, Adam White

Our focus on the role of the target policy of the auxiliary tasks is motivated by the fact that the target policy determines the behavior about which the agent wants to make a prediction and the state-action distribution that the agent is trained on, which further affects the main task learning.

Representation Learning

The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

no code implementations • 17 Mar 2022 • Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White

Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently.

Decision Making reinforcement-learning +1

Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making

no code implementations • 11 Jan 2022 • Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski

We further show how to computationally build this adaptive signalling process out of a fixed signalling process, characterized by fast continual prediction learning and minimal constraints on the nature of the agent receiving signals.

Decision Making reinforcement-learning +1

Counterfactual Instances Explain Little

no code implementations • 20 Sep 2021 • Adam White, Artur d'Avila Garcez

We will further illustrate how explainable AI methods that provide both causal equations and counterfactual instances can successfully explain machine learning predictions.

BIG-bench Machine Learning counterfactual +1

Learning Expected Emphatic Traces for Deep RL

no code implementations • 12 Jul 2021 • Ray Jiang, Shangtong Zhang, Veronica Chelu, Adam White, Hado van Hasselt

We develop a multi-step emphatic weighting that can be combined with replay, and a time-reversed $n$-step TD learning algorithm to learn the required emphatic weighting.
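As background for the emphatic weighting this abstract refers to, a minimal sketch of the standard one-step followon trace from emphatic TD (not the paper's multi-step, replay-compatible variant; the interest value and names are illustrative) might look like:

```python
import numpy as np

def followon_trace(rhos, gamma, interest=1.0):
    """Standard emphatic-TD followon trace: F_t = i_t + gamma * rho_{t-1} * F_{t-1}.

    rhos: importance-sampling ratios rho_0 .. rho_{T-1}.
    Returns the trace value F_0 .. F_T.
    """
    F = np.zeros(len(rhos) + 1)
    F[0] = interest
    for t in range(len(rhos)):
        # Accumulate emphasis backward-discounted through time.
        F[t + 1] = interest + gamma * rhos[t] * F[t]
    return F

# Under on-policy ratios (all 1), the trace accumulates geometrically.
F = followon_trace([1.0, 1.0, 1.0], gamma=0.9)
```

The emphasis each state receives grows with how much it was "followed from" earlier interest, which is what makes the weighting awkward to combine naively with replay, the problem the paper addresses.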

A Generalized Projected Bellman Error for Off-policy Value Estimation in Reinforcement Learning

no code implementations • 28 Apr 2021 • Andrew Patterson, Adam White, Martha White

Many algorithms have been developed for off-policy value estimation based on the linear mean squared projected Bellman error (MSPBE) and are sound under linear function approximation.

reinforcement-learning Reinforcement Learning (RL)

From Eye-blinks to State Construction: Diagnostic Benchmarks for Online Representation Learning

1 code implementation • 9 Nov 2020 • Banafsheh Rafiee, Zaheer Abbas, Sina Ghiassian, Raksha Kumaraswamy, Richard Sutton, Elliot Ludvig, Adam White

We present three new diagnostic prediction problems inspired by classical-conditioning experiments to facilitate research in online prediction learning.

Continual Learning Representation Learning

Gradient Temporal-Difference Learning with Regularized Corrections

1 code implementation • ICML 2020 • Sina Ghiassian, Andrew Patterson, Shivam Garg, Dhawal Gupta, Adam White, Martha White

It is still common to use Q-learning and temporal difference (TD) learning, even though they have divergence issues and sound Gradient TD alternatives exist, because divergence seems rare and they typically perform well.

Q-Learning
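A hedged sketch of the regularized-corrections idea: assuming the familiar TDC-style two-timescale update with an extra term pulling the secondary weights toward zero (step sizes, the regularization constant, and names are illustrative, not the paper's exact constants):

```python
import numpy as np

def tdrc_update(theta, w, phi, phi_next, r, gamma=0.99, alpha=0.1, beta=1.0):
    """One TDC-style update with a regularized correction on the secondary weights.

    The -beta * w term regularizes w toward zero, which is the
    stabilizing modification the abstract's gradient-TD method is built on.
    """
    delta = r + gamma * theta @ phi_next - theta @ phi           # TD error
    theta = theta + alpha * (delta * phi - gamma * (w @ phi) * phi_next)
    w = w + alpha * ((delta - w @ phi) * phi - beta * w)
    return theta, w

theta, w = np.zeros(2), np.zeros(2)
phi, phi_next = np.array([1.0, 0.0]), np.array([0.0, 1.0])
theta, w = tdrc_update(theta, w, phi, phi_next, r=1.0)
```

With zero initial weights the first update reduces to a plain TD step, while the correction terms only activate once w becomes nonzero.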

Training Recurrent Neural Networks Online by Learning Explicit State Variables

no code implementations • ICLR 2020 • Somjit Nath, Vincent Liu, Alan Chan, Xin Li, Adam White, Martha White

Recurrent neural networks (RNNs) allow an agent to construct a state-representation from a stream of experience, which is essential in partially observable problems.

Improving Performance in Reinforcement Learning by Breaking Generalization in Neural Networks

no code implementations • 16 Mar 2020 • Sina Ghiassian, Banafsheh Rafiee, Yat Long Lo, Adam White

Unfortunately, the performance of deep reinforcement learning systems is sensitive to hyper-parameter settings and architecture choices.

reinforcement-learning Reinforcement Learning (RL)

Meta-descent for Online, Continual Prediction

no code implementations • 17 Jul 2019 • Andrew Jacobsen, Matthew Schlegel, Cameron Linke, Thomas Degris, Adam White, Martha White

This paper investigates different vector step-size adaptation approaches for non-stationary online, continual prediction problems.

Second-order methods Time Series +1

Adapting Behaviour via Intrinsic Reward: A Survey and Empirical Study

no code implementations • 19 Jun 2019 • Cam Linke, Nadia M. Ady, Martha White, Thomas Degris, Adam White

The question we tackle in this paper is how to sculpt the stream of experience (that is, how to adapt the learning system's behavior) to optimize the learning of a collection of value functions.

Active Learning reinforcement-learning +2

Planning with Expectation Models

no code implementations • 2 Apr 2019 • Yi Wan, Zaheer Abbas, Adam White, Martha White, Richard S. Sutton

In particular, we 1) show that planning with an expectation model is equivalent to planning with a distribution model if the state value function is linear in state features, 2) analyze two common parametrization choices for approximating the expectation: linear and non-linear expectation models, 3) propose a sound model-based policy evaluation algorithm and present its convergence results, and 4) empirically demonstrate the effectiveness of the proposed planning algorithm.

Model-based Reinforcement Learning
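The equivalence in point 1 is easy to check numerically: with a value function linear in the features, evaluating at the expected next feature vector gives the same backup as averaging values over the next-state distribution. A small sketch (all numbers illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear value function: v(s) = theta @ phi(s).
theta = rng.normal(size=3)

# Two possible next feature vectors with probabilities 0.3 / 0.7.
phi_a, phi_b = rng.normal(size=3), rng.normal(size=3)
p = np.array([0.3, 0.7])

# Distribution-model backup: expectation of v over next states.
v_dist = p[0] * (theta @ phi_a) + p[1] * (theta @ phi_b)

# Expectation-model backup: v evaluated at the expected next features.
phi_bar = p[0] * phi_a + p[1] * phi_b
v_expect = theta @ phi_bar

# With a linear v these two backups coincide.
```

The equality fails once v is nonlinear, which is why the paper analyzes the nonlinear case separately.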

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations • 16 Nov 2018 • Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Context-Dependent Upper-Confidence Bounds for Directed Exploration

no code implementations • NeurIPS 2018 • Raksha Kumaraswamy, Matthew Schlegel, Adam White, Martha White

Directed exploration strategies for reinforcement learning are critical for learning an optimal policy in a minimal number of interactions with the environment.

Efficient Exploration
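As background for directed exploration, the classic UCB1 action rule (the standard bandit form, not the context-dependent bounds this paper develops; names and constants are illustrative) can be sketched as:

```python
import numpy as np

def ucb_action(counts, means, t, c=2.0):
    """Classic UCB1: pick the action maximizing empirical mean + exploration bonus.

    counts: number of times each action has been tried.
    means: empirical mean reward per action.
    t: total number of pulls so far.
    """
    counts = np.asarray(counts, dtype=float)
    bonus = np.sqrt(c * np.log(t) / counts)   # optimism shrinks as counts grow
    return int(np.argmax(np.asarray(means) + bonus))

# A rarely tried action wins despite a lower empirical mean.
a = ucb_action(counts=[100, 2], means=[0.5, 0.4], t=102)
```

Optimism in the face of uncertainty directs the agent toward under-explored actions, which is the principle the paper extends to continuous-state RL.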

Online Off-policy Prediction

no code implementations • 6 Nov 2018 • Sina Ghiassian, Andrew Patterson, Martha White, Richard S. Sutton, Adam White

The ability to learn behavior-contingent predictions online and off-policy has long been advocated as a key capability of predictive-knowledge learning systems but remained an open algorithmic challenge for decades.

Greedy Actor-Critic: A New Conditional Cross-Entropy Method for Policy Improvement

1 code implementation • 22 Oct 2018 • Samuel Neumann, Sungsu Lim, Ajin Joseph, Yangchen Pan, Adam White, Martha White

We first provide a policy improvement result in an idealized setting, and then prove that our conditional CEM (CCEM) strategy tracks a CEM update per state, even with changing action-values.

Policy Gradient Methods Q-Learning

General Value Function Networks

no code implementations • 18 Jul 2018 • Matthew Schlegel, Andrew Jacobsen, Zaheer Abbas, Andrew Patterson, Adam White, Martha White

A general purpose strategy for state construction is to learn the state update using a Recurrent Neural Network (RNN), which updates the internal state using the current internal state and the most recent observation.

Continuous Control Decision Making

Organizing Experience: A Deeper Look at Replay Mechanisms for Sample-based Planning in Continuous State Domains

no code implementations • 12 Jun 2018 • Yangchen Pan, Muhammad Zaheer, Adam White, Andrew Patterson, Martha White

We show that a model, as opposed to a replay buffer, is particularly useful for specifying which states to sample from during planning, such as predecessor states that propagate information in reverse from a state more quickly.

Discovery of Predictive Representations With a Network of General Value Functions

no code implementations • ICLR 2018 • Matthew Schlegel, Andrew Patterson, Adam White, Martha White

We investigate a framework for discovery: curating a large collection of predictions, which are used to construct the agent's representation of the world.

Decision Making

GQ(λ) Quick Reference and Implementation Guide

no code implementations • 10 May 2017 • Adam White, Richard S. Sutton

This document should serve as a quick reference for and guide to the implementation of linear GQ(λ), a gradient-based off-policy temporal-difference learning algorithm.

Accelerated Gradient Temporal Difference Learning

no code implementations • 28 Nov 2016 • Yangchen Pan, Adam White, Martha White

The family of temporal difference (TD) methods spans a spectrum from computationally frugal linear methods like TD(λ) to data-efficient least squares methods.
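As background for the frugal end of that spectrum, a minimal linear TD(λ) step with an accumulating eligibility trace (step sizes and names illustrative) might look like:

```python
import numpy as np

def td_lambda_step(theta, e, phi, phi_next, r, gamma=0.9, lam=0.8, alpha=0.1):
    """One step of linear TD(lambda) with an accumulating eligibility trace."""
    e = gamma * lam * e + phi                            # decay trace, add features
    delta = r + gamma * theta @ phi_next - theta @ phi   # TD error
    theta = theta + alpha * delta * e                    # credit all traced features
    return theta, e

theta, e = np.zeros(2), np.zeros(2)
phi1, phi2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
theta, e = td_lambda_step(theta, e, phi1, phi2, r=1.0)
theta, e = td_lambda_step(theta, e, phi2, phi1, r=0.0)
```

Each step costs O(d) in the feature dimension, compared with the O(d^2) per-step cost of least squares methods, which is the trade-off the paper's accelerated variant targets.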

A Greedy Approach to Adapting the Trace Parameter for Temporal Difference Learning

2 code implementations • 2 Jul 2016 • Martha White, Adam White

One of the main obstacles to broad application of reinforcement learning methods is the parameter sensitivity of our core learning algorithms.

Meta-Learning reinforcement-learning +1

Introspective Agents: Confidence Measures for General Value Functions

no code implementations • 17 Jun 2016 • Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski

Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions.

Position

Investigating practical linear temporal difference learning

1 code implementation • 28 Feb 2016 • Adam White, Martha White

First, we derive two new hybrid TD policy-evaluation algorithms, which fill a gap in this collection of algorithms.

reinforcement-learning Reinforcement Learning (RL)

Multi-timescale Nexting in a Reinforcement Learning Robot

no code implementations • 6 Dec 2011 • Joseph Modayil, Adam White, Richard S. Sutton

The term "nexting" has been used by psychologists to refer to the propensity of people and many other animals to continually predict what will happen next in an immediate, local, and personal sense.

reinforcement-learning Reinforcement Learning (RL)

Interval Estimation for Reinforcement-Learning Algorithms in Continuous-State Domains

no code implementations • NeurIPS 2010 • Martha White, Adam White

The reinforcement learning community has explored many approaches to obtaining value estimates and models to guide decision making; these approaches, however, do not usually provide a measure of confidence in the estimate.

Decision Making reinforcement-learning +1
