Search Results for author: Scott Fujimoto

Found 11 papers, 8 papers with code

A Deep Reinforcement Learning Approach to Marginalized Importance Sampling with the Successor Representation

1 code implementation • 12 Jun 2021 • Scott Fujimoto, David Meger, Doina Precup

We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.
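As a rough tabular sketch of that identity: with a row-stochastic transition matrix P_pi over (state, action) pairs, the successor representation is (I - gamma * P_pi)^{-1}, the target policy's discounted occupancy follows from it, and dividing by the dataset distribution gives the MIS density ratio. All sizes and distributions below are made up; the paper's deep RL version learns the SR with a network rather than inverting a matrix.

```python
import numpy as np

n = 6                                  # number of (state, action) pairs (illustrative)
gamma = 0.99
rng = np.random.default_rng(0)

P_pi = rng.random((n, n))              # (s,a) -> (s',a') transitions under the target policy
P_pi /= P_pi.sum(axis=1, keepdims=True)
d0 = np.full(n, 1.0 / n)               # initial (state, action) distribution
d_D = rng.random(n)                    # dataset (behavior) distribution
d_D /= d_D.sum()

# Successor representation: SR = (I - gamma * P_pi)^{-1}
SR = np.linalg.inv(np.eye(n) - gamma * P_pi)

# Discounted occupancy of the target policy: d_pi = (1 - gamma) * d0^T SR
d_pi = (1.0 - gamma) * d0 @ SR

# MIS density ratio: w(s, a) = d_pi(s, a) / d_D(s, a)
w = d_pi / d_D
assert np.isclose(d_pi.sum(), 1.0)     # the occupancy is a valid distribution
print(w)
```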

A Minimalist Approach to Offline Reinforcement Learning

2 code implementations • 12 Jun 2021 • Scott Fujimoto, Shixiang Shane Gu

Offline reinforcement learning (RL) defines the task of learning from a fixed batch of data.

Offline RL
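The paper's minimalist algorithm is TD3+BC: a behavior-cloning term added to TD3's policy update, with the Q term rescaled by the average absolute Q-value in the batch. A minimal sketch of that actor loss, assuming `actor` and `critic` modules and batch tensors already exist (alpha = 2.5 is the paper's default):

```python
import torch

def td3_bc_actor_loss(actor, critic, state, action, alpha=2.5):
    pi = actor(state)                        # actions proposed by the policy
    q = critic(state, pi)                    # critic's value of those actions
    lam = alpha / q.abs().mean().detach()    # normalize the Q scale across tasks
    # Maximize lam * Q while staying close to the dataset actions.
    return -lam * q.mean() + ((pi - action) ** 2).mean()
```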

Practical Marginalized Importance Sampling with the Successor Representation

no code implementations • 1 Jan 2021 • Scott Fujimoto, David Meger, Doina Precup

We bridge the gap between MIS and deep reinforcement learning by observing that the density ratio can be computed from the successor representation of the target policy.

An Equivalence between Loss Functions and Non-Uniform Sampling in Experience Replay

1 code implementation • NeurIPS 2020 • Scott Fujimoto, David Meger, Doina Precup

Prioritized Experience Replay (PER) is a deep reinforcement learning technique in which agents learn from transitions sampled with non-uniform probability proportionate to their temporal-difference error.
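For reference alongside the paper's equivalence result, a small sketch of PER's sampling rule: priorities |delta_i|^alpha, normalized into sampling probabilities, with importance weights correcting the induced bias. The buffer contents and hyperparameters below are placeholders:

```python
import numpy as np

rng = np.random.default_rng(0)
td_errors = rng.normal(size=1000)          # stand-in TD errors for a replay buffer
alpha, beta = 0.6, 0.4                     # standard PER hyperparameters

p = np.abs(td_errors) ** alpha
probs = p / p.sum()                        # P(i) proportional to |delta_i|^alpha
idx = rng.choice(len(probs), size=32, p=probs)

# Importance-sampling weights undo the sampling bias in the loss.
weights = (len(probs) * probs[idx]) ** (-beta)
weights /= weights.max()
```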

Benchmarking Batch Deep Reinforcement Learning Algorithms

3 code implementations • 3 Oct 2019 • Scott Fujimoto, Edoardo Conti, Mohammad Ghavamzadeh, Joelle Pineau

Widely used deep reinforcement learning algorithms have been shown to fail in the batch setting: learning from a fixed data set without interaction with the environment.

Q-Learning
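Among the benchmarked methods is a discrete-action variant of BCQ, which the paper introduces: actions whose estimated behavior probability is too small relative to the most likely action are masked out before the greedy argmax over Q. A sketch assuming `q_net` and `g_net` modules (the threshold value is illustrative):

```python
import torch

def bcq_action(q_net, g_net, state, tau=0.3):
    q = q_net(state)                   # Q-values over actions, shape [A]
    g = g_net(state).softmax(dim=-1)   # estimated behavior probabilities pi_b(a|s)
    mask = (g / g.max()) > tau         # keep only sufficiently likely actions
    q_masked = torch.where(mask, q, torch.full_like(q, -1e8))
    return q_masked.argmax().item()    # greedy action among the allowed set
```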

Off-Policy Deep Reinforcement Learning without Exploration

8 code implementations • 7 Dec 2018 • Scott Fujimoto, David Meger, Doina Precup

Many practical applications of reinforcement learning constrain agents to learn from a fixed batch of data which has already been gathered, without offering further possibility for data collection.

Continuous Control
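The paper's algorithm, BCQ, restricts action selection to candidates a generative model of the batch considers likely, so the critic is only queried on actions it has data for. A sketch of that selection step, assuming `vae_decode`, `perturb`, and `critic` modules; the candidate count is illustrative:

```python
import torch

def bcq_select_action(vae_decode, perturb, critic, state, n=10):
    states = state.repeat(n, 1)                   # evaluate n candidates at once
    actions = vae_decode(states)                  # actions likely under the batch
    actions = actions + perturb(states, actions)  # small learned perturbation
    q = critic(states, actions)                   # value each candidate
    return actions[q.argmax()]                    # best-valued in-distribution action
```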

Addressing Function Approximation Error in Actor-Critic Methods

42 code implementations • ICML 2018 • Scott Fujimoto, Herke van Hoof, David Meger

In value-based reinforcement learning methods such as deep Q-learning, function approximation errors are known to lead to overestimated value estimates and suboptimal policies.

OpenAI Gym • Q-Learning
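The paper's algorithm is TD3; its clipped double-Q target takes the minimum of two target critics at a noise-smoothed target-policy action, which counteracts the overestimation described above. A sketch assuming target networks and batch tensors already exist:

```python
import torch

def td3_target(critic1_t, critic2_t, actor_t, reward, next_state, not_done,
               gamma=0.99, sigma=0.2, c=0.5):
    # Target policy smoothing: add clipped noise to the target action.
    noise = (torch.randn_like(actor_t(next_state)) * sigma).clamp(-c, c)
    next_action = (actor_t(next_state) + noise).clamp(-1.0, 1.0)
    # Clipped double-Q: use the smaller of the two target critics.
    q1 = critic1_t(next_state, next_action)
    q2 = critic2_t(next_state, next_action)
    return reward + not_done * gamma * torch.min(q1, q2)
```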
