Search Results for author: Jean Harb

Found 8 papers, 5 papers with code

General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States

1 code implementation • 4 Jul 2022 • Francesco Faccio, Aditya Ramesh, Vincent Herrmann, Jean Harb, Jürgen Schmidhuber

In continuous control problems with infinitely many states, our value function minimizes its prediction error by simultaneously learning a small set of 'probing states' and a mapping from actions produced in probing states to the policy's return.

Continuous Control, Reinforcement Learning (RL), +1

Policy Evaluation Networks

no code implementations • 26 Feb 2020 • Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states.

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations • 16 Nov 2018 • Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Learnings Options End-to-End for Continuous Action Tasks

3 code implementations • 30 Nov 2017 • Martin Klissarov, Pierre-Luc Bacon, Jean Harb, Doina Precup

We present new results on learning temporally extended actions for continuous tasks, using the options framework (Sutton et al. [1999b], Precup [2000]).

When Waiting is not an Option : Learning Options with a Deliberation Cost

1 code implementation • 14 Sep 2017 • Jean Harb, Pierre-Luc Bacon, Martin Klissarov, Doina Precup

Recent work has shown that temporally extended actions (options) can be learned fully end-to-end as opposed to being specified in advance.

Atari Games

Investigating Recurrence and Eligibility Traces in Deep Q-Networks

no code implementations • 18 Apr 2017 • Jean Harb, Doina Precup

Eligibility traces in reinforcement learning are used to manage the bias-variance trade-off and can often speed up training by propagating knowledge back over time steps in a single update.

Atari Games, reinforcement-learning, +1

The Option-Critic Architecture

9 code implementations • 16 Sep 2016 • Pierre-Luc Bacon, Jean Harb, Doina Precup

Temporal abstraction is key to scaling up learning and planning in reinforcement learning.

reinforcement-learning, Reinforcement Learning (RL)
