no code implementations • 24 Jun 2022 • James Macglashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone
These value estimates provide insight into an agent's learning and decision-making process and enable new training methods to mitigate common problems.
no code implementations • 1 Apr 2020 • Craig Sherstan, Bilal Kartal, Pablo Hernandez-Leal, Matthew E. Taylor
Our overall conclusions are that TD-AE increases the robustness of the A2C algorithm to trajectory length, and that, while promising, further study is required to fully understand the relationship between the auxiliary task's prediction timescale and the agent's performance.
no code implementations • 18 Nov 2019 • Craig Sherstan, Shibhansh Dohare, James Macglashan, Johannes Günther, Patrick M. Pilarski
By using the timescale as one of the estimator's inputs, we can estimate value for arbitrary timescales.
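The idea in this entry, conditioning a single value estimator on the timescale, can be sketched roughly as follows. This is a minimal linear sketch under assumed features and update rules, not the paper's actual architecture; the feature construction and names are illustrative.

```python
import numpy as np

# Hedged sketch: a linear value estimator that takes the discount
# timescale gamma as an extra input, so one set of weights can produce
# value estimates for arbitrary timescales. The feature map below
# (state, gamma, and their interaction) is an illustrative assumption.

def features(state, gamma):
    """Concatenate state features, gamma, and their interaction."""
    state = np.asarray(state, dtype=float)
    return np.concatenate([state, [gamma], gamma * state])

class TimescaleValueEstimator:
    def __init__(self, n_state_features, alpha=0.1):
        self.w = np.zeros(2 * n_state_features + 1)
        self.alpha = alpha

    def value(self, state, gamma):
        """Value estimate for an arbitrary queried timescale gamma."""
        return float(features(state, gamma) @ self.w)

    def td_update(self, state, reward, next_state, gamma, done=False):
        """One TD(0) step toward r + gamma * v(s', gamma)."""
        target = reward
        if not done:
            target += gamma * self.value(next_state, gamma)
        phi = features(state, gamma)
        delta = target - phi @ self.w
        self.w += self.alpha * delta * phi
        return delta

# Repeated updates on a terminal transition with reward 1 drive the
# estimate at that (state, gamma) pair toward 1.
est = TimescaleValueEstimator(n_state_features=2)
s = [1.0, 0.0]
for _ in range(200):
    est.td_update(s, reward=1.0, next_state=s, gamma=0.9, done=True)
```

Because gamma is an input rather than a fixed constant, the same weights can be queried at timescales never used during training, which is the point of the approach described above.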
no code implementations • 23 Mar 2018 • Craig Sherstan, Marlos C. Machado, Patrick M. Pilarski
As a primary contribution of this work, we show that using successor representation (SR) based predictions can improve sample efficiency and learning speed in a continual learning setting where new predictions are incrementally added and learned over time.
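The benefit claimed in this entry can be illustrated with a tabular sketch: once the SR is learned by TD(0), values for a newly added reward prediction come from a single matrix-vector product rather than fresh TD learning over states. This tabular setup is an illustrative assumption, not the paper's experimental setting.

```python
import numpy as np

# Hedged sketch: learn the successor representation (SR) with TD(0) on a
# deterministic 3-state cycle, then reuse it for a new prediction.

def learn_sr(transitions, n_states, gamma=0.9, alpha=0.2):
    """transitions: iterable of (s, s_next) pairs.
    Returns Psi, where Psi[s] holds discounted expected future
    occupancies of each state starting from s."""
    psi = np.zeros((n_states, n_states))
    eye = np.eye(n_states)
    for s, s_next in transitions:
        # TD(0) update toward the SR target: indicator + gamma * Psi[s'].
        psi[s] += alpha * (eye[s] + gamma * psi[s_next] - psi[s])
    return psi

# Deterministic cycle 0 -> 1 -> 2 -> 0 -> ...
chain = [(t % 3, (t + 1) % 3) for t in range(6000)]
psi = learn_sr(chain, n_states=3)

# Adding a new prediction (reward of 1 on state 2) needs no further
# TD learning over states: values are just Psi @ r.
r = np.array([0.0, 0.0, 1.0])
values = psi @ r
```

For this cycle the SR has a closed form, `(I - gamma * P)^{-1}`, so the learned `psi[0]` should approach `[1, gamma, gamma**2] / (1 - gamma**3)`, and that is what makes cheap incremental predictions possible.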
no code implementations • 25 Jan 2018 • Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton
This paper investigates estimating the variance of a temporal-difference learning agent's update target.
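A common way to frame this estimation problem is through the first and second moments of the target, with the variance recovered as `Var = M - V^2`. The sketch below uses that construction on a one-step episodic example; it is a simplified stand-in, not necessarily the paper's TD-based estimator. For a general MDP the second-moment TD target would be `r^2 + 2*gamma*r*V(s') + gamma^2*M(s')`; in this terminal one-step case it reduces to `r^2`.

```python
import random

# Hedged sketch: estimate V ~ E[G] and M ~ E[G^2] for a one-state
# episodic MDP (draw a reward, then terminate) and report M - V^2.

def estimate_target_variance(sample_reward, n_episodes=20000, seed=0):
    rng = random.Random(seed)
    v = m = 0.0
    for t in range(1, n_episodes + 1):
        r = sample_reward(rng)
        alpha = 1.0 / t                 # decaying step size -> sample averages
        v += alpha * (r - v)            # target for V is r
        m += alpha * (r * r - m)        # target for M is r * r
    return m - v * v

# Reward is 0 or 2 with equal probability: E[G] = 1, Var[G] = 1.
var = estimate_target_variance(lambda rng: rng.choice([0.0, 2.0]))
```

Knowing the variance of the update target, not just its mean, is what enables the uses suggested in the entry, such as diagnosing noisy states or adapting step sizes.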
no code implementations • 10 Nov 2017 • Patrick M. Pilarski, Richard S. Sutton, Kory W. Mathewson, Craig Sherstan, Adam S. R. Parker, Ann L. Edwards
This work presents an overarching perspective on the role that machine intelligence can play in enhancing human abilities, especially those that have been diminished due to injury or illness.
no code implementations • 17 Jun 2016 • Craig Sherstan, Adam White, Marlos C. Machado, Patrick M. Pilarski
Agents of general intelligence deployed in real-world scenarios must adapt to ever-changing environmental conditions.