Search Results for author: Jean Harb

Found 8 papers, 5 papers with code

Multi-Agent Actor-Critic for Mixed Cooperative-Competitive Environments

84 code implementations • NeurIPS 2017 • Ryan Lowe, Yi Wu, Aviv Tamar, Jean Harb, Pieter Abbeel, Igor Mordatch

We explore deep reinforcement learning methods for multi-agent domains.

Ranked #1 on SMAC+ on Def_Infantry_sequential

Multi-agent Reinforcement Learning Q-Learning +3

30,954

Paper
Code

The Option-Critic Architecture

9 code implementations • 16 Sep 2016 • Pierre-Luc Bacon, Jean Harb, Doina Precup

Temporal abstraction is key to scaling up learning and planning in reinforcement learning.

reinforcement-learning Reinforcement Learning (RL)

165

Paper
Code

When Waiting is not an Option : Learning Options with a Deliberation Cost

1 code implementation • 14 Sep 2017 • Jean Harb, Pierre-Luc Bacon, Martin Klissarov, Doina Precup

Recent work has shown that temporally extended actions (options) can be learned fully end-to-end as opposed to being specified in advance.

Atari Games

Paper
Code

Learnings Options End-to-End for Continuous Action Tasks

3 code implementations • 30 Nov 2017 • Martin Klissarov, Pierre-Luc Bacon, Jean Harb, Doina Precup

We present new results on learning temporally extended actions for continuoustasks, using the options framework (Suttonet al.[1999b], Precup [2000]).

Paper
Code

General Policy Evaluation and Improvement by Learning to Identify Few But Crucial States

1 code implementation • 4 Jul 2022 • Francesco Faccio, Aditya Ramesh, Vincent Herrmann, Jean Harb, Jürgen Schmidhuber

In continuous control problems with infinitely many states, our value function minimizes its prediction error by simultaneously learning a small set of `probing states' and a mapping from actions produced in probing states to the policy's return.

Continuous Control Reinforcement Learning (RL) +1

Paper
Code

Investigating Recurrence and Eligibility Traces in Deep Q-Networks

no code implementations • 18 Apr 2017 • Jean Harb, Doina Precup

Eligibility traces in reinforcement learning are used as a bias-variance trade-off and can often speed up training time by propagating knowledge back over time-steps in a single update.

Atari Games reinforcement-learning +1

Paper
Add Code

The Barbados 2018 List of Open Issues in Continual Learning

no code implementations • 16 Nov 2018 • Tom Schaul, Hado van Hasselt, Joseph Modayil, Martha White, Adam White, Pierre-Luc Bacon, Jean Harb, Shibl Mourad, Marc Bellemare, Doina Precup

We want to make progress toward artificial general intelligence, namely general-purpose agents that autonomously learn how to competently act in complex environments.

Continual Learning

Paper
Add Code

Policy Evaluation Networks

no code implementations • 26 Feb 2020 • Jean Harb, Tom Schaul, Doina Precup, Pierre-Luc Bacon

The core idea of this paper is to flip this convention and estimate the value of many policies, for a single set of states.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.