1 code implementation • 10 Jul 2024 • Cameron Allen, Aaron Kirtland, Ruo Yu Tao, Sam Lobel, Daniel Scott, Nicholas Petrocelli, Omer Gottesman, Ronald Parr, Michael L. Littman, George Konidaris
Our metric, the $\lambda$-discrepancy, is the difference between two distinct temporal difference (TD) value estimates, each computed using TD($\lambda$) with a different value of $\lambda$.
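A minimal sketch of the underlying idea (the function and episode data below are hypothetical, not the paper's code): compute λ-return targets for two values of λ and compare them. For a Markov state representation the two estimates should agree in expectation, so a persistent gap signals partial observability.

```python
import numpy as np

def lambda_returns(rewards, values, gamma, lam):
    """Backward recursion for lambda-returns over one episode:
    G_t = r_{t+1} + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1})."""
    T = len(rewards)
    G = np.zeros(T)
    next_return = values[T]  # bootstrap from the final value estimate
    for t in reversed(range(T)):
        next_return = rewards[t] + gamma * (
            (1 - lam) * values[t + 1] + lam * next_return
        )
        G[t] = next_return
    return G

# Hypothetical 3-step episode with value estimates V(s_0)..V(s_3)
rewards = np.array([0.0, 0.0, 1.0])
values = np.array([0.5, 0.6, 0.8, 0.0])
gamma = 0.99

td0 = lambda_returns(rewards, values, gamma, lam=0.0)  # one-step TD targets
mc = lambda_returns(rewards, values, gamma, lam=1.0)   # Monte Carlo returns
print("empirical gap:", np.abs(td0 - mc).mean())
```

Note that the paper defines the discrepancy between the TD(λ) value estimates themselves; the per-episode gap above is only a sampled proxy for that quantity.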
no code implementations • 11 May 2022 • Zhiyuan Zhou, Cameron Allen, Kavosh Asadi, George Konidaris
We study the action generalization ability of deep Q-learning in discrete action spaces.
no code implementations • 23 Oct 2021 • Omer Gottesman, Kavosh Asadi, Cameron Allen, Sam Lobel, George Konidaris, Michael Littman
We propose a new coarse-grained smoothness definition that generalizes the notion of Lipschitz continuity, is more widely applicable, and allows us to compute significantly tighter bounds on Q-functions, leading to improved learning.
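The abstract does not spell out the definition, but the flavor of such bounds can be illustrated with a toy comparison (the specific per-step form below is our assumption, not the paper's): a global Lipschitz constant must absorb the worst local slope, whereas a coarser budget of "at most ε change per ρ-sized step" can be far tighter.

```python
import math

def lipschitz_bound(L, d):
    # Classic Lipschitz bound: |Q(x) - Q(y)| <= L * d(x, y).
    return L * d

def coarse_grained_bound(eps, rho, d):
    # Illustrative coarse-grained bound: the value changes by at most
    # eps over any step of size rho, hence by at most eps * ceil(d / rho)
    # over distance d.  (A stand-in for the paper's definition.)
    return eps * math.ceil(d / rho)

d = 2.0
print(lipschitz_bound(L=5.0, d=d))                  # 10.0
print(coarse_grained_bound(eps=0.1, rho=0.5, d=d))  # 0.4
```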
no code implementations • 7 Oct 2021 • David Abel, Cameron Allen, Dilip Arumugam, D. Ellis Hershkowitz, Michael L. Littman, Lawson L. S. Wong
We address this question by proposing a simple measure of reinforcement-learning hardness called the bad-policy density.
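One way to read such a measure (a hypothetical toy construction, not the paper's code): enumerate all deterministic policies of a small tabular MDP and count the fraction whose start-state value falls more than ε below the best; the denser the bad policies, the harder uninformed search becomes.

```python
import itertools
import numpy as np

def policy_value(P, R, policy, gamma, s0):
    """Start-state value of a deterministic policy via a linear solve."""
    nS = P.shape[0]
    P_pi = P[np.arange(nS), policy]  # (nS, nS) transitions under pi
    r_pi = R[np.arange(nS), policy]  # (nS,) expected rewards under pi
    v = np.linalg.solve(np.eye(nS) - gamma * P_pi, r_pi)
    return v[s0]

def bad_policy_density(P, R, gamma, s0, eps):
    """Fraction of deterministic policies more than eps below the best
    start-state value (an illustrative reading of the measure)."""
    nS, nA = R.shape
    values = [policy_value(P, R, np.array(pi), gamma, s0)
              for pi in itertools.product(range(nA), repeat=nS)]
    v_star = max(values)
    return sum(v < v_star - eps for v in values) / len(values)

# Tiny 2-state, 2-action MDP (made-up numbers, purely for illustration)
P = np.zeros((2, 2, 2))
P[0, 0] = [1.0, 0.0]; P[0, 1] = [0.0, 1.0]
P[1, 0] = [0.0, 1.0]; P[1, 1] = [1.0, 0.0]
R = np.array([[0.0, 0.0], [1.0, 0.5]])
print(bad_policy_density(P, R, gamma=0.9, s0=0, eps=0.1))
```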
1 code implementation • NeurIPS 2021 • Cameron Allen, Neev Parikh, Omer Gottesman, George Konidaris
A fundamental assumption of reinforcement learning in Markov decision processes (MDPs) is that the relevant decision process is, in fact, Markov.
no code implementations • 17 Oct 2020 • Michael Fishman, Nishanth Kumar, Cameron Allen, Natasha Danas, Michael Littman, Stefanie Tellex, George Konidaris
Unfortunately, planning to solve any specific task using an open-scope model is computationally intractable, even for state-of-the-art methods, due to the many states and actions that are necessarily present in the model but irrelevant to that problem.
2 code implementations • 28 Apr 2020 • Cameron Allen, Michael Katz, Tim Klinger, George Konidaris, Matthew Riemer, Gerald Tesauro
Focused macros dramatically improve black-box planning efficiency across a wide range of planning domains, sometimes beating even state-of-the-art planners with access to a full domain model.
2 code implementations • 1 Sep 2017 • Cameron Allen, Kavosh Asadi, Melrose Roderick, Abdel-rahman Mohamed, George Konidaris, Michael Littman
We propose a new algorithm, Mean Actor-Critic (MAC), for discrete-action continuous-state reinforcement learning.
Ranked #1 on Continuous Control on Cart Pole (OpenAI Gym)
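A minimal PyTorch sketch of the MAC idea as described above (batch shapes and names are illustrative, not the paper's code): instead of the usual single-sample policy-gradient term, average the critic over all discrete actions, weighted by the policy.

```python
import torch

def mac_policy_loss(logits, q_values):
    """Mean Actor-Critic policy loss (sketch): average the critic over
    ALL discrete actions, weighted by the policy, rather than over a
    single sampled action as in standard actor-critic.

    logits:   (batch, n_actions) policy network outputs
    q_values: (batch, n_actions) critic estimates Q(s, a), treated as
              fixed targets here (hence .detach()).
    """
    probs = torch.softmax(logits, dim=-1)
    # Gradient ascent on E_s[ sum_a pi(a|s) Q(s, a) ]; negate for a loss.
    return -(probs * q_values.detach()).sum(dim=-1).mean()

# Hypothetical batch
logits = torch.randn(4, 3, requires_grad=True)
q_values = torch.randn(4, 3)
loss = mac_policy_loss(logits, q_values)
loss.backward()
```

The design intuition (our reading, not stated in the snippet above) is that summing over the full action distribution avoids the variance introduced by sampling a single action per update.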