1 code implementation • 1 Apr 2021 • Dylan Ashley, Anssi Kanervisto, Brendan Bennett
We present AlphaChute: a state-of-the-art algorithm that achieves superhuman performance in the ancient game of Chutes and Ladders.
no code implementations • 5 Jul 2019 • Brendan Bennett, Wesley Chung, Muhammad Zaheer, Vincent Liu
Temporal difference methods enable efficient, incremental estimation of value functions in reinforcement learning, and are of broader interest because they correspond to learning as observed in biological systems.
no code implementations • 20 Sep 2018 • Kristopher De Asis, Brendan Bennett, Richard S. Sutton
Temporal difference (TD) learning is an important approach in reinforcement learning, as it combines ideas from dynamic programming and Monte Carlo methods in a way that allows for online and incremental model-free learning.
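To make the online, incremental flavor of TD learning concrete, here is a minimal TD(0) sketch on a toy random-walk chain. The environment, function names, and hyperparameters are illustrative assumptions, not taken from the paper itself.

```python
import random

def td0_random_walk(num_episodes=5000, alpha=0.05, gamma=1.0, seed=0):
    """Estimate state values with TD(0) on a 7-state random walk.

    States 0..6; states 0 and 6 are terminal. Each step moves left or
    right with equal probability; reaching state 6 yields reward 1,
    everything else yields 0. True values are k/6 for state k.
    (Toy example for illustration, not from the paper.)
    """
    rng = random.Random(seed)
    v = [0.0] * 7  # value estimates; terminal values stay at 0

    for _ in range(num_episodes):
        s = 3  # every episode starts in the middle
        while s not in (0, 6):
            s_next = s + rng.choice([-1, 1])
            r = 1.0 if s_next == 6 else 0.0
            # Online, model-free TD(0) update toward the bootstrapped
            # target r + gamma * V(s'), applied after every single step.
            v[s] += alpha * (r + gamma * v[s_next] - v[s])
            s = s_next
    return v
```

The update bootstraps off the current estimate of the next state (as in dynamic programming) while sampling transitions from experience (as in Monte Carlo), which is the combination the abstract describes.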
no code implementations • 25 Jan 2018 • Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton
This paper investigates estimating the variance of a temporal-difference learning agent's update target.