no code implementations • 24 Jun 2022 • James MacGlashan, Evan Archer, Alisa Devlic, Takuma Seno, Craig Sherstan, Peter R. Wurman, Peter Stone
These value estimates provide insight into an agent's learning and decision-making process and enable new training methods to mitigate common problems.
no code implementations • 18 Nov 2019 • Craig Sherstan, Shibhansh Dohare, James MacGlashan, Johannes Günther, Patrick M. Pilarski
By using the timescale as one of the estimator's inputs we can estimate value for arbitrary timescales.
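As a minimal sketch of this idea (not the paper's code): treat the discount factor γ as an extra input to the value estimator, so a single estimator can be queried at any timescale. Here the estimator is simply a table keyed by (state, γ) trained with TD(0) on a one-state looping MDP with constant reward 1, where the true value is 1/(1 − γ); a neural network conditioned on γ would generalize across timescales instead of tabulating them.

```python
# Hedged sketch: TD(0) value estimation with the timescale (gamma)
# supplied as an input to the estimator. The MDP is a single state
# that loops forever with reward 1, so the true value is 1 / (1 - g).
def train(gammas, steps=5000, alpha=0.1):
    V = {(0, g): 0.0 for g in gammas}  # estimator indexed by (state, gamma)
    for _ in range(steps):
        for g in gammas:
            # TD(0) update toward r + g * V(s') with r = 1, s' = s
            V[(0, g)] += alpha * (1.0 + g * V[(0, g)] - V[(0, g)])
    return V

V = train([0.5, 0.9])
# V[(0, 0.5)] converges to 2.0 and V[(0, 0.9)] to 10.0
```

Querying the same trained estimator at different γ values yields value estimates at different horizons without retraining.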
1 code implementation • 20 Nov 2017 • Melrose Roderick, James MacGlashan, Stefanie Tellex
The Deep Q-Network proposed by Mnih et al. [2015] has become a benchmark and starting point for much deep reinforcement learning research.
no code implementations • 14 Apr 2017 • Michael L. Littman, Ufuk Topcu, Jie Fu, Charles Isbell, Min Wen, James MacGlashan
We propose a new task-specification language for Markov decision processes that is designed to be an improvement over reward functions by being environment independent.
no code implementations • ICML 2017 • James MacGlashan, Mark K. Ho, Robert Loftin, Bei Peng, Guan Wang, David Roberts, Matthew E. Taylor, Michael L. Littman
This paper investigates the problem of interactively learning behaviors communicated by a human teacher using positive and negative feedback.
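A hypothetical sketch in the spirit of this setting (not the paper's algorithm): interpret each +1/−1 feedback signal as an advantage-like weight in a policy-gradient update on a softmax policy over actions. The simulated `feedback` function standing in for the human teacher is an assumption for illustration.

```python
import math
import random

def softmax(prefs):
    # numerically stable softmax over action preferences
    m = max(prefs)
    e = [math.exp(p - m) for p in prefs]
    z = sum(e)
    return [x / z for x in e]

def shape_policy(feedback, n_actions=3, steps=2000, alpha=0.1, seed=0):
    """Learn a policy from +/-1 teacher feedback.

    feedback: callable mapping a chosen action to +1 (good) or -1 (bad),
    standing in for the human teacher.
    """
    rng = random.Random(seed)
    prefs = [0.0] * n_actions
    for _ in range(steps):
        pi = softmax(prefs)
        a = rng.choices(range(n_actions), weights=pi)[0]
        f = feedback(a)
        # policy-gradient update with feedback as the advantage term
        for b in range(n_actions):
            grad = (1.0 if b == a else 0.0) - pi[b]
            prefs[b] += alpha * f * grad
    return softmax(prefs)

# A teacher that praises action 0 and criticizes everything else
pi = shape_policy(lambda a: 1 if a == 0 else -1)
```

Under this simulated teacher the policy concentrates its probability mass on the praised action; the design choice of treating feedback as an advantage (rather than a reward to be maximized cumulatively) is what makes the update responsive to policy-dependent human judgments.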
no code implementations • NeurIPS 2016 • Mark K. Ho, Michael L. Littman, James MacGlashan, Fiery Cushman, Joseph L. Austerweil
Stark differences arise when demonstrators are intentionally teaching a task versus simply performing a task.