no code implementations • 20 Dec 2022 • Michael Bowling, John D. Martin, David Abel, Will Dabney
The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)."
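The "cumulative sum of a received scalar signal" in the hypothesis is usually formalized as the (discounted) return. A minimal illustrative sketch, not taken from the paper, with the discount factor `gamma` as an assumed parameter:

```python
# Hypothetical illustration of the return G = r_0 + gamma*r_1 + gamma^2*r_2 + ...
# computed backwards for numerical simplicity.

def discounted_return(rewards, gamma=0.9):
    """Discounted cumulative sum of a scalar reward sequence."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

print(discounted_return([1.0, 0.0, 2.0], gamma=0.5))  # 1 + 0.5*0 + 0.25*2 = 1.5
```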
no code implementations • 22 May 2022 • Esra'a Saleh, John D. Martin, Anna Koop, Arash Pourzarabi, Michael Bowling
We focus our investigations on Dyna-style planning in a prediction setting.
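Dyna-style planning in a prediction setting can be sketched as TD(0) value estimation on real transitions, plus extra "planning" updates replayed from a learned model. The tabular chain environment and all parameter names below are illustrative assumptions, not the paper's setup:

```python
import random

def td_update(V, s, r, s2, alpha=0.1, gamma=0.9):
    """One TD(0) prediction update toward the bootstrapped target."""
    V[s] += alpha * (r + gamma * V[s2] - V[s])

def dyna_predict(n_states=5, sweeps=200, plan_steps=5, seed=0):
    rng = random.Random(seed)
    V = [0.0] * (n_states + 1)     # index n_states is a terminal state
    model = {}                     # learned deterministic model: s -> (r, s')
    for _ in range(sweeps):
        s = rng.randrange(n_states)
        s2 = s + 1                 # deterministic chain environment
        r = 1.0 if s2 == n_states else 0.0
        td_update(V, s, r, s2)     # learn from the real transition
        model[s] = (r, s2)         # update the model
        for _ in range(plan_steps):  # planning: replay modeled transitions
            sp = rng.choice(list(model))
            rp, sp2 = model[sp]
            td_update(V, sp, rp, sp2)
    return V

V = dyna_predict()
# Values increase toward the rewarding end of the chain.
```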
no code implementations • 17 Jun 2021 • John D. Martin, Joseph Modayil
However, prevailing optimization techniques are not designed for strictly-incremental online updates.
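A strictly-incremental online update processes each example exactly once, in order, with constant memory and no stored batch. The linear model and squared loss below are illustrative assumptions, not the paper's method:

```python
# Hedged sketch of a strictly-incremental online learner: one pass over the
# stream, one gradient step per sample, O(1) memory beyond the weights.

def online_sgd(stream, n_features, lr=0.01):
    w = [0.0] * n_features
    for x, y in stream:                  # each sample seen exactly once
        pred = sum(wi * xi for wi, xi in zip(w, x))
        err = pred - y
        for i in range(n_features):      # gradient step on this sample only
            w[i] -= lr * err * x[i]
    return w

stream = [([1.0], 2.0)] * 500            # target function: y = 2*x
w = online_sgd(stream, n_features=1)
print(w)                                  # close to [2.0]
```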
no code implementations • 2 Aug 2020 • John D. Martin, Kevin Doherty, Caralyn Cyr, Brendan Englot, John Leonard
The ability to infer map variables and estimate pose is crucial to the operation of autonomous mobile robots.
1 code implementation • 24 Jul 2020 • Fanfei Chen, John D. Martin, Yewei Huang, Jinkun Wang, Brendan Englot
We consider an autonomous exploration problem in which a range-sensing mobile robot must efficiently and accurately map the landmarks of an a priori unknown environment in real time, choosing sensing actions that both curb localization uncertainty and achieve information gain.
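One common way to score candidate sensing actions by information gain is the entropy of the map cells each action would observe. The occupancy grid, candidate views, and scoring below are illustrative assumptions, not the paper's actual method (which also accounts for localization uncertainty):

```python
import math

def cell_entropy(p):
    """Shannon entropy (bits) of one occupancy cell with P(occupied) = p."""
    if p in (0.0, 1.0):
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))

def best_view(occupancy, views):
    """views: mapping view_name -> indices of cells that view observes."""
    gain = {v: sum(cell_entropy(occupancy[i]) for i in idx)
            for v, idx in views.items()}
    return max(gain, key=gain.get)

occ = [0.5, 0.5, 0.9, 1.0]               # unknown cells carry a full bit each
views = {"left": [0, 1], "right": [2, 3]}
print(best_view(occ, views))             # "left": 2 bits vs ~0.47 bits
```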
1 code implementation • 28 Feb 2020 • William Fedus, Dibya Ghosh, John D. Martin, Marc G. Bellemare, Yoshua Bengio, Hugo Larochelle
Our study provides a clear empirical link between catastrophic interference and sample efficiency in reinforcement learning.
no code implementations • ICML 2020 • John D. Martin, Michal Lyskawinski, Xiaohu Li, Brendan Englot
We describe a new approach for managing aleatoric uncertainty in the Reinforcement Learning (RL) paradigm.
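The distributional-RL idea behind this line of work is to track a distribution of returns rather than a single expected value; the distribution's spread reflects aleatoric (irreducible) uncertainty. The stochastic bandit and empirical estimator below are an illustrative sketch, not the paper's algorithm:

```python
import random
import statistics

def return_samples(reward_fn, n=1000, seed=0):
    """Draw an empirical return distribution from a stochastic reward source."""
    rng = random.Random(seed)
    return [reward_fn(rng) for _ in range(n)]

# Two actions with the same mean return (1.0) but different aleatoric risk.
safe = return_samples(lambda rng: 1.0)
risky = return_samples(lambda rng: rng.choice([0.0, 2.0]))

print(statistics.mean(safe), statistics.pstdev(safe))    # 1.0, 0.0
print(statistics.mean(risky), statistics.pstdev(risky))  # ~1.0, ~1.0
```

An expected-value learner sees these two actions as identical; the return distribution distinguishes them.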