no code implementations • 16 May 2023 • Adam S. R. Parker, Michael R. Dawson, Patrick M. Pilarski
One challenge identified in these learning methods is that they can forget previously learned predictions when a user begins to successfully act upon delivered feedback.
no code implementations • 28 Dec 2022 • Michael R. Dawson, Adam S. R. Parker, Heather E. Williams, Ahmed W. Shehata, Jacqueline S. Hebert, Craig S. Chapman, Patrick M. Pilarski
Recent advances in upper limb prostheses have led to significant increases in the number of movements provided by the robotic limb.
no code implementations • 1 Dec 2022 • Nadia M. Ady, Roshan Shariff, Johannes Günther, Patrick M. Pilarski
As a second main contribution of this work, we show how these properties may be implemented together in a proof-of-concept reinforcement learning agent: we demonstrate how the properties manifest in the behaviour of this agent in a simple non-episodic grid-world environment that includes curiosity-inducing locations and induced targets of curiosity.
no code implementations • 14 Oct 2022 • Nathan J. Wispinski, Andrew Butcher, Kory W. Mathewson, Craig S. Chapman, Matthew M. Botvinick, Patrick M. Pilarski
Patch foraging is one of the most heavily studied behavioral optimization challenges in biology.
no code implementations • 23 Aug 2022 • Richard S. Sutton, Michael Bowling, Patrick M. Pilarski
Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan.
no code implementations • 13 Jun 2022 • Alexandra Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski
In computational reinforcement learning, a growing body of work seeks to construct an agent's perception of the world through predictions of future sensations; predictions about environment observations are used as additional input features to enable better goal-directed decision-making.
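As a minimal sketch of this idea (the class names and choice of cumulant below are illustrative assumptions, not the paper's implementation), a linear TD(0) prediction can be learned online and its output appended to the raw observation as an extra input feature:

```python
import numpy as np

# Sketch of predictions-as-features: a general value function (GVF) learns to
# anticipate a sensor reading, and its prediction is appended to the raw
# observation vector handed to the decision-maker. All names are illustrative.

class TDPrediction:
    """Linear TD(0) prediction of a cumulant (here, a future sensor value)."""

    def __init__(self, n_features, gamma=0.9, alpha=0.1):
        self.w = np.zeros(n_features)
        self.gamma = gamma
        self.alpha = alpha

    def predict(self, x):
        return self.w @ x

    def update(self, x, cumulant, x_next):
        # TD error: cumulant plus discounted next prediction, minus current one.
        delta = cumulant + self.gamma * self.predict(x_next) - self.predict(x)
        self.w += self.alpha * delta * x


def augmented_observation(obs, gvf):
    """Append the GVF's prediction to the raw observation as an extra feature."""
    return np.append(obs, gvf.predict(obs))
```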
no code implementations • 20 May 2022 • Nadia M. Ady, Roshan Shariff, Johannes Günther, Patrick M. Pilarski
Curiosity for machine agents has been a focus of intense research.
no code implementations • 20 Apr 2022 • Kory W. Mathewson, Patrick M. Pilarski
A major open research question in the field of interactive machine learning (IML) is: "How should we design systems that can learn to make better decisions over time with human interaction?"
no code implementations • 17 Mar 2022 • Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White
Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently.
no code implementations • 11 Jan 2022 • Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski
We further show how to computationally build this adaptive signalling process out of a fixed signalling process, characterized by fast continual prediction learning and minimal constraints on the nature of the agent receiving signals.
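A minimal sketch of how such a fixed signalling process can be wrapped around a continually learned prediction (the thresholding rule, token values, and constants below are illustrative assumptions, not the paper's implementation):

```python
import numpy as np

# Sketch of Pavlovian signalling: a continually updated TD(0) prediction of an
# upcoming event drives a *fixed* signalling rule -- here, a simple threshold
# that switches a discrete token on. Threshold and constants are illustrative.

class PavlovianSignaller:
    """Continual TD(0) prediction feeding a fixed threshold signalling rule."""

    def __init__(self, n_features, gamma=0.9, alpha=0.1, threshold=0.5):
        self.w = np.zeros(n_features)
        self.gamma, self.alpha, self.threshold = gamma, alpha, threshold

    def step(self, x, cumulant, x_next):
        # Continually update the prediction of the upcoming event...
        delta = cumulant + self.gamma * (self.w @ x_next) - (self.w @ x)
        self.w += self.alpha * delta * x
        # ...and pass it through a fixed mapping to a discrete token.
        return int(self.w @ x > self.threshold)
```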
no code implementations • 14 Dec 2021 • Dylan J. A. Brenneis, Adam S. Parker, Michael Bradley Johanson, Andrew Butcher, Elnaz Davoodi, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White, Patrick M. Pilarski
Additionally, we compare two different agent architectures to assess how representational choices in agent design affect the human-agent interaction.
no code implementations • 18 Nov 2021 • Alex Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski
In computational reinforcement learning, a growing body of work seeks to express an agent's model of the world through predictions about future sensations.
no code implementations • 27 Aug 2020 • Katya Kudashkina, Patrick M. Pilarski, Richard S. Sutton
In this article we argue for the domain of voice document editing and for the methods of model-based reinforcement learning.
no code implementations • 23 Jan 2020 • Alex Kearney, Anna Koop, Patrick M. Pilarski
Constructing general knowledge by learning task-independent models of the world can help agents solve challenging problems.
no code implementations • 18 Nov 2019 • Craig Sherstan, Shibhansh Dohare, James MacGlashan, Johannes Günther, Patrick M. Pilarski
By using the timescale as one of the estimator's inputs we can estimate value for arbitrary timescales.
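A minimal sketch of the idea, assuming a simple linear estimator rather than the paper's architecture: the timescale γ is concatenated onto the state features, and TD updates are made at several sampled timescales so that the estimator can later be queried at any γ:

```python
import numpy as np

# Sketch of timescale-conditioned value estimation (details assumed): the
# discount gamma becomes an extra input, and TD(0) updates are performed at
# several sampled timescales on each transition.

class TimescaleValue:
    def __init__(self, n_features, alpha=0.05):
        self.w = np.zeros(n_features + 1)  # +1 input slot for gamma
        self.alpha = alpha

    def value(self, x, gamma):
        return self.w @ np.append(x, gamma)

    def update(self, x, reward, x_next, gammas=(0.5, 0.9, 0.99)):
        # One TD(0) update per sampled timescale on the same transition.
        for g in gammas:
            delta = reward + g * self.value(x_next, g) - self.value(x, g)
            self.w += self.alpha * delta * np.append(x, g)
```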
no code implementations • 15 Aug 2019 • Johannes Günther, Nadia M. Ady, Alex Kearney, Michael R. Dawson, Patrick M. Pilarski
Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation.
no code implementations • 30 May 2019 • Johannes Günther, Elias Reichensdörfer, Patrick M. Pilarski, Klaus Diepold
In this paper, we examine the utility of extending PID controllers with recurrent neural networks, namely General Dynamic Neural Networks (GDNN); we show that GDNN (neural) PID controllers perform well on a range of control systems, and we highlight how they can be a scalable and interpretable option for control systems.
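For concreteness, the discrete PID baseline that such neural controllers extend looks like the following (gains and timestep are arbitrary placeholders, not values from the paper):

```python
# A textbook discrete PID controller, included to make concrete the baseline
# that GDNN (neural) controllers extend. Gains and timestep are illustrative.

class PID:
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def control(self, setpoint, measurement):
        error = setpoint - measurement
        self.integral += error * self.dt
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# e.g. controller = PID(kp=1.2, ki=0.5, kd=0.05, dt=0.01)
```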
no code implementations • 7 May 2019 • Patrick M. Pilarski, Andrew Butcher, Michael Johanson, Matthew M. Botvinick, Andrew Bolt, Adam S. R. Parker
In this work, we contribute a virtual reality environment wherein a human and an agent can adapt their predictions, their actions, and their communication so as to pursue a simple foraging task.
no code implementations • 18 Apr 2019 • Alex Kearney, Patrick M. Pilarski
While promising, we suggest here that the notion of predictions as knowledge in reinforcement learning is as yet underdeveloped: although some work explicitly refers to predictions as knowledge, the requirements for considering a prediction to be knowledge have yet to be well explored.
no code implementations • 8 Mar 2019 • Alex Kearney, Vivek Veeriah, Jaden Travnik, Patrick M. Pilarski, Richard S. Sutton
In this paper, we examine an instance of meta-learning in which feature relevance is learned by adapting the step-size parameters of stochastic gradient descent, building on a variety of prior work in stochastic approximation, machine learning, and artificial neural networks.
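A compact sketch of one well-known method in this family, Sutton's IDBD for least-mean-square learning (not necessarily the paper's exact variant): each weight carries its own log step-size, adapted by a meta step-size.

```python
import numpy as np

# IDBD-style step-size adaptation for least-mean-square learning (a sketch of
# one prior method this line of work builds on). Each weight i carries a log
# step-size beta_i, adapted by meta step-size theta via a trace h.

def idbd(X, y, theta=0.01, beta_init=np.log(0.05)):
    n = X.shape[1]
    w = np.zeros(n)               # prediction weights
    beta = np.full(n, beta_init)  # per-weight log step-sizes
    h = np.zeros(n)               # trace of recent weight updates
    for x, target in zip(X, y):
        delta = target - w @ x
        beta += theta * delta * x * h          # meta-gradient step
        alpha = np.exp(beta)                   # per-weight step-sizes
        w += alpha * delta * x
        # decay the trace where the step-size overshot, then accumulate
        h = h * np.clip(1 - alpha * x * x, 0, None) + alpha * delta * x
    return w
```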
no code implementations • 10 Apr 2018 • Alex Kearney, Vivek Veeriah, Jaden B. Travnik, Richard S. Sutton, Patrick M. Pilarski
In this paper, we introduce a method for adapting the step-sizes of temporal difference (TD) learning.
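The same per-weight machinery can be moved to temporal-difference prediction by using the TD error as the error signal. Below is a semi-gradient TD(0) sketch in the spirit of such methods (the full algorithm also handles eligibility traces; constants are illustrative):

```python
import numpy as np

# Semi-gradient TD(0) with per-weight adapted step-sizes: the TD error drives
# both the weight update and the meta update of the log step-sizes beta.
# A sketch under stated assumptions, not the paper's exact algorithm.

def adaptive_td0_step(w, beta, h, x, r, x_next, gamma=0.9, theta=0.01):
    delta = r + gamma * (w @ x_next) - (w @ x)  # TD error
    beta += theta * delta * x * h               # adapt per-weight step-sizes
    alpha = np.exp(beta)
    w += alpha * delta * x
    h = h * np.clip(1 - alpha * x * x, 0, None) + alpha * delta * x
    return w, beta, h
```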
no code implementations • 23 Mar 2018 • Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski
As a primary contribution of this work, we show that using SR-based predictions can improve sample efficiency and learning speed in a continual learning setting where new predictions are incrementally added and learned over time.
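A tabular sketch of the successor representation (SR), a standard construction whose details here are illustrative: once the expected discounted state occupancy is learned, the value of any newly added reward signal is a single dot product, which is what makes incrementally adding predictions cheap.

```python
import numpy as np

# Tabular successor representation: Psi[s] estimates the expected discounted
# future occupancy of each state from s. Values for any (possibly new) reward
# vector are then just Psi[s] @ reward_vector.

def sr_td_update(Psi, s, s_next, gamma=0.95, alpha=0.1):
    n = Psi.shape[0]
    onehot = np.eye(n)[s]
    Psi[s] += alpha * (onehot + gamma * Psi[s_next] - Psi[s])

def value(Psi, s, reward_vector):
    """Value of state s under any (possibly newly added) reward signal."""
    return Psi[s] @ reward_vector
```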
no code implementations • 16 Feb 2018 • Jaden B. Travnik, Kory W. Mathewson, Richard S. Sutton, Patrick M. Pilarski
The relationship between a reinforcement learning (RL) agent and an asynchronous environment is often ignored.
no code implementations • 10 Nov 2017 • Patrick M. Pilarski, Richard S. Sutton, Kory W. Mathewson, Craig Sherstan, Adam S. R. Parker, Ann L. Edwards
This work presents an overarching perspective on the role that machine intelligence can play in enhancing human abilities, especially those that have been diminished due to injury or illness.
no code implementations • 3 Mar 2017 • Kory W. Mathewson, Patrick M. Pilarski
In many human-machine interaction settings, there is a growing gap between the degrees-of-freedom of complex semi-autonomous systems and the number of human control channels.
no code implementations • 9 Jan 2017 • Kory W. Mathewson, Patrick M. Pilarski
We illustrate the impact of varying human feedback parameters on task performance by investigating the probability of giving feedback on each time step and the likelihood that any given feedback is correct.
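These two parameters are easy to make concrete; a sketch of a simulated human trainer under these assumptions (the ±1 encoding and default values are illustrative, not from the paper):

```python
import random

# Simulated human feedback with two parameters: the probability of giving
# feedback on a given time step, and the probability that feedback is correct.

def human_feedback(action_was_good, p_feedback=0.5, p_correct=0.8):
    """Return +1/-1 feedback, or None when the human stays silent."""
    if random.random() > p_feedback:
        return None                       # no feedback this step
    correct = random.random() < p_correct
    signal = 1 if action_was_good else -1
    return signal if correct else -signal
```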
no code implementations • 22 Jun 2016 • Kory W. Mathewson, Patrick M. Pilarski
This paper contributes a preliminary report on the advantages and disadvantages of incorporating simultaneous human control and feedback signals in the training of a reinforcement learning robotic agent.
no code implementations • 17 Jun 2016 • Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski
Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions.
no code implementations • 9 Jun 2016 • Vivek Veeriah, Patrick M. Pilarski, Richard S. Sutton
The primary objective of the current work is to demonstrate that a learning agent can reduce the amount of explicit feedback required to adapt to a user's preferences for a task by learning to perceive the value of its behaviour directly from the human user, in particular from the user's facial expressions; we call this approach face valuing.
1 code implementation • 13 Dec 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton
Our results suggest that the true online methods indeed dominate the regular methods.
no code implementations • 1 Jul 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton
Our results confirm the strength of true online TD(λ): 1) for sparse feature vectors, the computational overhead relative to TD(λ) is minimal, and for non-sparse features the computation time is at most twice that of TD(λ); 2) across all domains and representations, the learning speed of true online TD(λ) is often better than, and never worse than, that of TD(λ); and 3) true online TD(λ) is easier to use, because it does not require choosing between trace types and is generally more stable with respect to the step-size.
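For reference, the true online TD(λ) update with linear function approximation can be sketched as follows (following the published update equations; the initialization and constants here are illustrative):

```python
import numpy as np

# True online TD(lambda) with linear function approximation: a dutch-style
# eligibility trace plus a correction term involving the previous value
# estimate v_old. Constants and the transition format are illustrative.

def true_online_td_lambda(transitions, n_features,
                          alpha=0.1, gamma=0.99, lam=0.9):
    theta = np.zeros(n_features)   # value weights
    e = np.zeros(n_features)       # dutch-style eligibility trace
    v_old = 0.0
    for x, r, x_next in transitions:   # (features, reward, next features)
        v, v_next = theta @ x, theta @ x_next
        delta = r + gamma * v_next - v
        e = gamma * lam * e + x - alpha * gamma * lam * (e @ x) * x
        theta += alpha * (delta + v - v_old) * e - alpha * (v - v_old) * x
        v_old = v_next
    return theta
```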
no code implementations • 8 Aug 2014 • Adam S. R. Parker, Ann L. Edwards, Patrick M. Pilarski
Our study therefore contributes initial evidence that prediction learning and machine intelligence can benefit not just control, but also feedback from an artificial limb.
no code implementations • 18 Sep 2013 • Ann L. Edwards, Alexandra Kearney, Michael Rory Dawson, Richard S. Sutton, Patrick M. Pilarski
In the present work, we explore the use of temporal-difference learning and GVFs to predict when users will switch their control influence between the different motor functions of a robot arm.