Search Results for author: David Abel

Found 20 papers, 5 papers with code

Settling the Reward Hypothesis

no code implementations 20 Dec 2022 Michael Bowling, John D. Martin, David Abel, Will Dabney

The reward hypothesis posits that, "all of what we mean by goals and purposes can be well thought of as maximization of the expected value of the cumulative sum of a received scalar signal (reward)."

Meta-Gradients in Non-Stationary Environments

no code implementations 13 Sep 2022 Jelena Luketina, Sebastian Flennerhag, Yannick Schroecker, David Abel, Tom Zahavy, Satinder Singh

We support these results with a qualitative analysis of resulting meta-parameter schedules and learned functions of context features.

A Theory of Abstraction in Reinforcement Learning

no code implementations 1 Mar 2022 David Abel

Reinforcement learning defines the problem facing agents that learn to make good decisions through action and observation alone.

On the Expressivity of Markov Reward

no code implementations NeurIPS 2021 David Abel, Will Dabney, Anna Harutyunyan, Mark K. Ho, Michael L. Littman, Doina Precup, Satinder Singh

We then provide a set of polynomial-time algorithms that construct a Markov reward function that allows an agent to optimize tasks of each of these three types, and correctly determine when no such reward function exists.

People construct simplified mental representations to plan

no code implementations 14 May 2021 Mark K. Ho, David Abel, Carlos G. Correa, Michael L. Littman, Jonathan D. Cohen, Thomas L. Griffiths

We propose a computational account of this simplification process and, in a series of pre-registered behavioral experiments, show that it is subject to online cognitive control and that people optimally balance the complexity of a task representation and its utility for planning and acting.

Revisiting Peng's Q($\lambda$) for Modern Reinforcement Learning

no code implementations 27 Feb 2021 Tadashi Kozuno, Yunhao Tang, Mark Rowland, Rémi Munos, Steven Kapturowski, Will Dabney, Michal Valko, David Abel

These results indicate that Peng's Q($\lambda$), which was thought to be unsafe, is a theoretically sound and practically effective algorithm.

What can I do here? A Theory of Affordances in Reinforcement Learning

1 code implementation ICML 2020 Khimya Khetarpal, Zafarali Ahmed, Gheorghe Comanici, David Abel, Doina Precup

Gibson (1977) coined the term "affordances" to describe the fact that certain states enable an agent to do certain actions, in the context of embodied agents.

The Efficiency of Human Cognition Reflects Planned Information Processing

no code implementations 13 Feb 2020 Mark K. Ho, David Abel, Jonathan D. Cohen, Michael L. Littman, Thomas L. Griffiths

Thus, people should plan their actions, but they should also be smart about how they deploy resources used for planning their actions.

Learning State Abstractions for Transfer in Continuous Control

2 code implementations 8 Feb 2020 Kavosh Asadi, David Abel, Michael L. Littman

In this work, we answer this question in the affirmative, where we take "simple learning algorithm" to be tabular Q-Learning, the "good representations" to be a learned state abstraction, and "challenging problems" to be continuous control tasks.

Lipschitz Lifelong Reinforcement Learning

1 code implementation 15 Jan 2020 Erwan Lecarpentier, David Abel, Kavosh Asadi, Yuu Jinnai, Emmanuel Rachelson, Michael L. Littman

We consider the problem of knowledge transfer when an agent is facing a series of Reinforcement Learning (RL) tasks.

Mitigating Planner Overfitting in Model-Based Reinforcement Learning

no code implementations 3 Dec 2018 Dilip Arumugam, David Abel, Kavosh Asadi, Nakul Gopalan, Christopher Grimm, Jun Ki Lee, Lucas Lehnert, Michael L. Littman

An agent with an inaccurate model of its environment faces a difficult choice: it can ignore the errors in its model and act in the real world in whatever way it determines is optimal with respect to its model.

Finding Options that Minimize Planning Time

no code implementations 16 Oct 2018 Yuu Jinnai, David Abel, D. Ellis Hershkowitz, Michael Littman, George Konidaris

We formalize the problem of selecting the optimal set of options for planning as that of computing the smallest set of options so that planning converges in less than a given maximum of value-iteration passes.

Policy and Value Transfer in Lifelong Reinforcement Learning

no code implementations ICML 2018 David Abel, Yuu Jinnai, Sophie Yue Guo, George Konidaris, Michael Littman

We consider the problem of how best to use prior experience to bootstrap lifelong learning, where an agent faces a series of task instances drawn from some task distribution.

State Abstractions for Lifelong Reinforcement Learning

no code implementations ICML 2018 David Abel, Dilip Arumugam, Lucas Lehnert, Michael Littman

We introduce two new classes of abstractions: (1) transitive state abstractions, whose optimal form can be computed efficiently, and (2) PAC state abstractions, which are guaranteed to hold with respect to a distribution of tasks.

Agent-Agnostic Human-in-the-Loop Reinforcement Learning

no code implementations 15 Jan 2017 David Abel, John Salvatier, Andreas Stuhlmüller, Owain Evans

Providing Reinforcement Learning agents with expert advice can dramatically improve various aspects of learning.

Near Optimal Behavior via Approximate State Abstraction

1 code implementation 15 Jan 2017 David Abel, D. Ellis Hershkowitz, Michael L. Littman

The combinatorial explosion that plagues planning and reinforcement learning (RL) algorithms can be moderated using state abstraction.

Exploratory Gradient Boosting for Reinforcement Learning in Complex Domains

1 code implementation 14 Mar 2016 David Abel, Alekh Agarwal, Fernando Diaz, Akshay Krishnamurthy, Robert E. Schapire

We address both of these challenges with two complementary techniques: First, we develop a gradient-boosting style, non-parametric function approximator for learning on $Q$-function residuals.
