Search Results for author: Patrick M. Pilarski

Found 33 papers, 1 paper with code

Continually Learned Pavlovian Signalling Without Forgetting for Human-in-the-Loop Robotic Control

no code implementations • 16 May 2023 • Adam S. R. Parker, Michael R. Dawson, Patrick M. Pilarski

One challenge identified in these learning methods is that they can forget previously learned predictions when a user begins to successfully act upon delivered feedback.

Five Properties of Specific Curiosity You Didn't Know Curious Machines Should Have

no code implementations • 1 Dec 2022 • Nadia M. Ady, Roshan Shariff, Johannes Günther, Patrick M. Pilarski

As a second main contribution of this work, we show how these properties may be implemented together in a proof-of-concept reinforcement learning agent: we demonstrate how the properties manifest in the behaviour of this agent in a simple non-episodic grid-world environment that includes curiosity-inducing locations and induced targets of curiosity.

Decision Making, reinforcement-learning, +1

The Alberta Plan for AI Research

no code implementations • 23 Aug 2022 • Richard S. Sutton, Michael Bowling, Patrick M. Pilarski

Herein we describe our approach to artificial intelligence research, which we call the Alberta Plan.

What Should I Know? Using Meta-gradient Descent for Predictive Feature Discovery in a Single Stream of Experience

no code implementations • 13 Jun 2022 • Alexandra Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski

In computational reinforcement learning, a growing body of work seeks to construct an agent's perception of the world through predictions of future sensations; predictions about environment observations are used as additional input features to enable better goal-directed decision-making.

Continual Learning, Decision Making

A Brief Guide to Designing and Evaluating Human-Centered Interactive Machine Learning

no code implementations • 20 Apr 2022 • Kory W. Mathewson, Patrick M. Pilarski

Two major open research questions in the field of IML are: "How should we design systems that can learn to make better decisions over time with human interaction?"

BIG-bench Machine Learning, Decision Making, +1

The Frost Hollow Experiments: Pavlovian Signalling as a Path to Coordination and Communication Between Agents

no code implementations • 17 Mar 2022 • Patrick M. Pilarski, Andrew Butcher, Elnaz Davoodi, Michael Bradley Johanson, Dylan J. A. Brenneis, Adam S. R. Parker, Leslie Acker, Matthew M. Botvinick, Joseph Modayil, Adam White

Our results showcase the speed of learning for Pavlovian signalling, the impact that different temporal representations do (and do not) have on agent-agent coordination, and how temporal aliasing impacts agent-agent and human-agent interactions differently.

Decision Making, reinforcement-learning, +1

Pavlovian Signalling with General Value Functions in Agent-Agent Temporal Decision Making

no code implementations • 11 Jan 2022 • Andrew Butcher, Michael Bradley Johanson, Elnaz Davoodi, Dylan J. A. Brenneis, Leslie Acker, Adam S. R. Parker, Adam White, Joseph Modayil, Patrick M. Pilarski

We further show how to computationally build this adaptive signalling process out of a fixed signalling process, characterized by fast continual prediction learning and minimal constraints on the nature of the agent receiving signals.

Decision Making, reinforcement-learning, +1

Finding Useful Predictions by Meta-gradient Descent to Improve Decision-making

no code implementations • 18 Nov 2021 • Alex Kearney, Anna Koop, Johannes Günther, Patrick M. Pilarski

In computational reinforcement learning, a growing body of work seeks to express an agent's model of the world through predictions about future sensations.

Decision Making

What's a Good Prediction? Challenges in evaluating an agent's knowledge

no code implementations • 23 Jan 2020 • Alex Kearney, Anna Koop, Patrick M. Pilarski

Constructing general knowledge by learning task-independent models of the world can help agents solve challenging problems.

Continual Learning, General Knowledge

Examining the Use of Temporal-Difference Incremental Delta-Bar-Delta for Real-World Predictive Knowledge Architectures

no code implementations • 15 Aug 2019 • Johannes Günther, Nadia M. Ady, Alex Kearney, Michael R. Dawson, Patrick M. Pilarski

Predictions and predictive knowledge have seen recent success in improving not only robot control but also other applications ranging from industrial process control to rehabilitation.

Representation Learning

Interpretable PID Parameter Tuning for Control Engineering using General Dynamic Neural Networks: An Extensive Comparison

no code implementations • 30 May 2019 • Johannes Günther, Elias Reichensdörfer, Patrick M. Pilarski, Klaus Diepold

In this paper, we examine the utility of extending PID controllers with recurrent neural networks, namely General Dynamic Neural Networks (GDNN); we show that GDNN (neural) PID controllers perform well on a range of control systems and highlight how they can be a scalable and interpretable option for control systems.

Learned human-agent decision-making, communication and joint action in a virtual reality environment

no code implementations • 7 May 2019 • Patrick M. Pilarski, Andrew Butcher, Michael Johanson, Matthew M. Botvinick, Andrew Bolt, Adam S. R. Parker

In this work, we contribute a virtual reality environment wherein a human and an agent can adapt their predictions, their actions, and their communication so as to pursue a simple foraging task.

Decision Making

When is a Prediction Knowledge?

no code implementations • 18 Apr 2019 • Alex Kearney, Patrick M. Pilarski

While promising, we here suggest that the notion of predictions as knowledge in reinforcement learning is as yet underdeveloped: although some work explicitly refers to predictions as knowledge, the requirements for considering a prediction to be knowledge have yet to be well explored.

Decision Making, reinforcement-learning, +1

Learning Feature Relevance Through Step Size Adaptation in Temporal-Difference Learning

no code implementations • 8 Mar 2019 • Alex Kearney, Vivek Veeriah, Jaden Travnik, Patrick M. Pilarski, Richard S. Sutton

In this paper, we examine an instance of meta-learning in which feature relevance is learned by adapting step size parameters of stochastic gradient descent, building on a variety of prior work in stochastic approximation, machine learning, and artificial neural networks.

Meta-Learning, Representation Learning

Accelerating Learning in Constructive Predictive Frameworks with the Successor Representation

no code implementations • 23 Mar 2018 • Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski

As a primary contribution of this work, we show that using SR-based predictions can improve sample efficiency and learning speed in a continual learning setting where new predictions are incrementally added and learned over time.

Continual Learning, Reinforcement Learning (RL)

Communicative Capital for Prosthetic Agents

no code implementations • 10 Nov 2017 • Patrick M. Pilarski, Richard S. Sutton, Kory W. Mathewson, Craig Sherstan, Adam S. R. Parker, Ann L. Edwards

This work presents an overarching perspective on the role that machine intelligence can play in enhancing human abilities, especially those that have been diminished due to injury or illness.

Actor-Critic Reinforcement Learning with Simultaneous Human Control and Feedback

no code implementations • 3 Mar 2017 • Kory W. Mathewson, Patrick M. Pilarski

In many human-machine interaction settings, there is a growing gap between the degrees-of-freedom of complex semi-autonomous systems and the number of human control channels.

reinforcement-learning, Reinforcement Learning (RL)

Reinforcement Learning based Embodied Agents Modelling Human Users Through Interaction and Multi-Sensory Perception

no code implementations • 9 Jan 2017 • Kory W. Mathewson, Patrick M. Pilarski

We illustrate the impact of varying human feedback parameters on task performance by investigating the probability of giving feedback on each time step and the likelihood of given feedback being correct.

Reinforcement Learning (RL)

Simultaneous Control and Human Feedback in the Training of a Robotic Agent with Actor-Critic Reinforcement Learning

no code implementations • 22 Jun 2016 • Kory W. Mathewson, Patrick M. Pilarski

This paper contributes a preliminary report on the advantages and disadvantages of incorporating simultaneous human control and feedback signals in the training of a reinforcement learning robotic agent.

reinforcement-learning, Reinforcement Learning (RL)

Introspective Agents: Confidence Measures for General Value Functions

no code implementations • 17 Jun 2016 • Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski

Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions.

Face valuing: Training user interfaces with facial expressions and reinforcement learning

no code implementations • 9 Jun 2016 • Vivek Veeriah, Patrick M. Pilarski, Richard S. Sutton

The primary objective of the current work is to demonstrate that a learning agent can reduce the amount of explicit feedback required for adapting to the user's preferences pertaining to a task by learning to perceive a value of its behavior from the human user, particularly from the user's facial expressions; we call this face valuing.

BIG-bench Machine Learning, reinforcement-learning, +1

True Online Temporal-Difference Learning

1 code implementation • 13 Dec 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Marlos C. Machado, Richard S. Sutton

Our results suggest that the true online methods indeed dominate the regular methods.

Atari Games

An Empirical Evaluation of True Online TD(λ)

no code implementations • 1 Jul 2015 • Harm van Seijen, A. Rupam Mahmood, Patrick M. Pilarski, Richard S. Sutton

Our results confirm the strength of true online TD(λ): 1) for sparse feature vectors, the computational overhead with respect to TD(λ) is minimal; for non-sparse features the computation time is at most twice that of TD(λ), 2) across all domains/representations the learning speed of true online TD(λ) is often better, but never worse than that of TD(λ), and 3) true online TD(λ) is easier to use, because it does not require choosing between trace types, and it is generally more stable with respect to the step-size.


Using Learned Predictions as Feedback to Improve Control and Communication with an Artificial Limb: Preliminary Findings

no code implementations • 8 Aug 2014 • Adam S. R. Parker, Ann L. Edwards, Patrick M. Pilarski

Our study therefore contributes initial evidence that prediction learning and machine intelligence can benefit not just control, but also feedback from an artificial limb.

Temporal-Difference Learning to Assist Human Decision Making during the Control of an Artificial Limb

no code implementations • 18 Sep 2013 • Ann L. Edwards, Alexandra Kearney, Michael Rory Dawson, Richard S. Sutton, Patrick M. Pilarski

In the present work, we explore the use of temporal-difference learning and GVFs to predict when users will switch their control influence between the different motor functions of a robot arm.

Decision Making, Reinforcement Learning (RL)
