Search Results for author: Brett Daley

Found 13 papers, 7 papers with code

Compound Returns Reduce Variance in Reinforcement Learning

no code implementations • 6 Feb 2024 • Brett Daley, Martha White, Marlos C. Machado

Multistep returns, such as $n$-step returns and $\lambda$-returns, are commonly used to improve the sample efficiency of reinforcement learning (RL) methods.
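
For context, the standard definitions (textbook material, not quoted from the paper): the $n$-step return is $G_t^{(n)} = \sum_{k=1}^{n} \gamma^{k-1} R_{t+k} + \gamma^n V(S_{t+n})$, and the $\lambda$-return is the geometrically weighted average $G_t^{\lambda} = (1-\lambda) \sum_{n=1}^{\infty} \lambda^{n-1} G_t^{(n)}$; a compound return is any such weighted average of $n$-step returns.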

reinforcement-learning • Reinforcement Learning (RL)

Trajectory-Aware Eligibility Traces for Off-Policy Reinforcement Learning

1 code implementation • 26 Jan 2023 • Brett Daley, Martha White, Christopher Amato, Marlos C. Machado

Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, but counteracting off-policy bias without exacerbating variance is challenging.
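
As background (standard off-policy corrections, not this paper's method): multistep off-policy returns are typically written with per-step trace coefficients $c_k$, as in $G_t = Q(S_t, A_t) + \sum_{k \ge t} \gamma^{k-t} \left( \prod_{i=t+1}^{k} c_i \right) \delta_k$ with TD errors $\delta_k = R_{k+1} + \gamma \mathbb{E}_\pi Q(S_{k+1}, \cdot) - Q(S_k, A_k)$. Choosing $c_i = \lambda \rho_i$ with $\rho_i = \pi(A_i \mid S_i) / \mu(A_i \mid S_i)$ (per-decision importance sampling) is unbiased but high-variance, while truncations such as $c_i = \lambda \min(1, \rho_i)$ (Retrace) control variance at the cost of slower credit propagation.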

reinforcement-learning • Reinforcement Learning (RL)

Adaptive Tree Backup Algorithms for Temporal-Difference Reinforcement Learning

no code implementations • 4 Jun 2022 • Brett Daley, Isaac Chan

Q($\sigma$) is a recently proposed temporal-difference learning method that interpolates between learning from expected backups and sampled backups.
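
For reference, the one-step Q($\sigma$) target blends the two backup types with a parameter $\sigma \in [0, 1]$: $G_t = R_{t+1} + \gamma \left[ \sigma Q(S_{t+1}, A_{t+1}) + (1 - \sigma) \sum_a \pi(a \mid S_{t+1}) Q(S_{t+1}, a) \right]$, recovering a sampled (Sarsa-style) backup at $\sigma = 1$ and an expected (Tree-Backup-style) backup at $\sigma = 0$; this is the standard definition, not a result of this paper.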

reinforcement-learning • Reinforcement Learning (RL)

Improving the Efficiency of Off-Policy Reinforcement Learning by Accounting for Past Decisions

no code implementations • 23 Dec 2021 • Brett Daley, Christopher Amato

Off-policy learning from multistep returns is crucial for sample-efficient reinforcement learning, particularly in the experience replay setting now commonly used with deep neural networks.

reinforcement-learning • Reinforcement Learning (RL)

Virtual Replay Cache

1 code implementation • 6 Dec 2021 • Brett Daley, Christopher Amato

Return caching is a recent strategy that enables efficient minibatch training with multistep estimators (e.g., the $\lambda$-return) for deep reinforcement learning.
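
As a minimal sketch of the recursion such a cache would store (illustrative only; the function name and interface are assumptions, not the paper's code), $\lambda$-returns for a finished episode can be computed in a single backward pass and then reused across minibatches:

import numpy as np

def lambda_returns(rewards, next_values, gamma=0.99, lam=0.95):
    """One backward pass of G_t = r_t + gamma * ((1 - lam) * V(s_{t+1}) + lam * G_{t+1}).

    rewards[t] is the reward after step t; next_values[t] is V(s_{t+1})
    (zero at a terminal state). Names and signature are illustrative assumptions.
    """
    G = next_values[-1]  # bootstrap past the final step
    out = np.empty_like(rewards, dtype=np.float64)
    for t in reversed(range(len(rewards))):
        G = rewards[t] + gamma * ((1.0 - lam) * next_values[t] + lam * G)
        out[t] = G  # cache these; minibatches can index into the cache
    return out

Computing these once per episode avoids rerunning the full recursion for every sampled minibatch element.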

Atari Games • reinforcement-learning • +1

Human-Level Control without Server-Grade Hardware

1 code implementation • 1 Nov 2021 • Brett Daley, Christopher Amato

Deep Q-Network (DQN) marked a major milestone for reinforcement learning, demonstrating for the first time that human-level control policies could be learned directly from raw visual inputs via reward maximization.

Cloud Computing • reinforcement-learning • +1

Investigating Alternatives to the Root Mean Square for Adaptive Gradient Methods

no code implementations • 10 Jun 2021 • Brett Daley, Christopher Amato

Adam is an adaptive gradient method that has experienced widespread adoption due to its fast and reliable training performance.
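
For context, the "root mean square" in question is Adam's second-moment EMA (standard definition, not the paper's proposal): $v_t = \beta_2 v_{t-1} + (1 - \beta_2) g_t^2$, with parameter update $\theta_{t+1} = \theta_t - \alpha \, \hat{m}_t / (\sqrt{\hat{v}_t} + \epsilon)$, where $\hat{m}_t$ and $\hat{v}_t$ are the bias-corrected first and second moments.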

Stratified Experience Replay: Correcting Multiplicity Bias in Off-Policy Reinforcement Learning

no code implementations • 22 Feb 2021 • Brett Daley, Cameron Hickert, Christopher Amato

Our theory prescribes a special non-uniform distribution to cancel this effect, and we propose a stratified sampling scheme to efficiently implement it.
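
A loudly hedged illustration of stratified sampling over a replay buffer (generic sketch only; the grouping key and the paper's prescribed distribution are not given in this snippet, so key_fn and the uniform-over-strata choice are assumptions):

import random
from collections import defaultdict

def stratified_sample(buffer, key_fn, batch_size, rng=random):
    """Group transitions into strata, then sample a stratum before sampling
    within it, so frequently duplicated transitions cannot dominate the batch.
    Generic illustration; not the paper's exact scheme."""
    strata = defaultdict(list)
    for transition in buffer:
        strata[key_fn(transition)].append(transition)
    keys = list(strata)
    return [rng.choice(strata[rng.choice(keys)]) for _ in range(batch_size)]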

reinforcement-learning • Reinforcement Learning (RL)

Contrasting Centralized and Decentralized Critics in Multi-Agent Reinforcement Learning

no code implementations • 8 Feb 2021 • Xueguang Lyu, Yuchen Xiao, Brett Daley, Christopher Amato

Centralized Training for Decentralized Execution, where agents are trained offline using centralized information but execute in a decentralized manner online, has gained popularity in the multi-agent reinforcement learning community.

Misconceptions • Multi-agent Reinforcement Learning • +2

Belief-Grounded Networks for Accelerated Robot Learning under Partial Observability

1 code implementation • 19 Oct 2020 • Hai Nguyen, Brett Daley, Xinchao Song, Christopher Amato, Robert Platt

Many important robotics problems are partially observable in the sense that a single visual or force-feedback measurement is insufficient to reconstruct the state.

Expectigrad: Fast Stochastic Optimization with Robust Convergence Properties

1 code implementation • 3 Oct 2020 • Brett Daley, Christopher Amato

Many popular adaptive gradient methods such as Adam and RMSProp rely on an exponential moving average (EMA) to normalize their stepsizes.
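
A minimal sketch of the EMA normalization being referred to (a standard RMSProp-style update shown for context; this is not Expectigrad itself):

import numpy as np

def rmsprop_step(theta, grad, v, lr=1e-3, beta=0.999, eps=1e-8):
    """One RMSProp-style update: an EMA of squared gradients sets the stepsize.
    Standard baseline shown for context, not the paper's method."""
    v = beta * v + (1.0 - beta) * grad**2  # exponential moving average of g^2
    theta = theta - lr * grad / (np.sqrt(v) + eps)
    return theta, v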

Stochastic Optimization

Reconciling λ-Returns with Experience Replay

1 code implementation • NeurIPS 2019 • Brett Daley, Christopher Amato

Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the λ-return difficult in this context.

Atari Games • Incremental Learning

Reconciling $\lambda$-Returns with Experience Replay

1 code implementation • 23 Oct 2018 • Brett Daley, Christopher Amato

Modern deep reinforcement learning methods have departed from the incremental learning required for eligibility traces, rendering the implementation of the $\lambda$-return difficult in this context.

Atari Games • Incremental Learning
