no code implementations • 2 Oct 2023 • Kenny Young, Richard S. Sutton
Discovering useful temporal abstractions, in the form of options, is widely thought to be key to applying reinforcement learning and planning to increasingly complex domains.
1 code implementation • 4 Nov 2022 • Kenny Young, Aditya Ramesh, Louis Kirsch, Jürgen Schmidhuber
First, we provide a simple theorem showing how learning a model as an intermediate step can narrow down the set of possible value functions more than learning a value function directly from data using the Bellman equation.
no code implementations • 4 Jul 2022 • Tian Tian, Kenny Young, Richard S. Sutton
However, Asynchronous VI still requires a maximization over the entire action space, making it impractical for domains with large action spaces.
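To make the bottleneck concrete, here is a minimal sketch (not the paper's proposed method) of asynchronous value iteration on a toy random MDP; the names `n_states`, `n_actions`, `P`, and `R` are illustrative assumptions. Note the `np.max` over the full action set inside every state update, which is the cost the abstract identifies.

```python
import numpy as np

# Illustrative toy MDP; all sizes and quantities are assumptions.
rng = np.random.default_rng(0)
n_states, n_actions, gamma = 5, 3, 0.9

# Transition probabilities P[s, a, s'] and rewards R[s, a]
P = rng.random((n_states, n_actions, n_states))
P /= P.sum(axis=2, keepdims=True)
R = rng.random((n_states, n_actions))

V = np.zeros(n_states)
for sweep in range(1000):
    s = sweep % n_states  # asynchronous: back up one state at a time
    # Bellman optimality backup: still maximizes over every action,
    # costing O(|A| * |S|) per state update.
    V[s] = np.max(R[s] + gamma * P[s] @ V)
```

Even though states are updated one at a time, each update scans all `n_actions` actions, which is exactly what becomes impractical when the action space is large.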
no code implementations • 14 Oct 2021 • Kenny Young
We then show how HNCA can be extended to optimize a more general function of the outputs of a network of stochastic units, where the function is known to the agent.
no code implementations • 24 Nov 2020 • Kenny Young
We present Hindsight Network Credit Assignment (HNCA), a novel learning method for stochastic neural networks, which works by assigning credit to each neuron's stochastic output based on how it influences the output of its immediate children in the network.
no code implementations • 28 Oct 2020 • Kenny Young, Richard S. Sutton
We demonstrate analytically and experimentally that such pathological behaviours can impact a wide range of RL and dynamic programming algorithms; such behaviours can arise both with and without bootstrapping, and with linear function approximation as well as with more complex parameterized functions like neural networks.
no code implementations • 19 Nov 2019 • Kenny Young
Hindsight Credit Assignment (HCA) refers to a recently proposed family of methods for producing more efficient credit assignment in reinforcement learning.
3 code implementations • 7 Mar 2019 • Kenny Young, Tian Tian
With the representation learning problem simplified, we can perform experiments with significantly less computational expense.
no code implementations • 10 May 2018 • Kenny Young, Baoxiang Wang, Matthew E. Taylor
Finally, we apply Metatrace for control with nonlinear function approximation in five games from the Arcade Learning Environment, where we explore how it impacts learning speed and robustness to the initial step-size choice.
no code implementations • 25 Jan 2018 • Craig Sherstan, Brendan Bennett, Kenny Young, Dylan R. Ashley, Adam White, Martha White, Richard S. Sutton
This paper investigates estimating the variance of a temporal-difference learning agent's update target.
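As a purely illustrative sketch (not the paper's estimator), the quantity in question can be seen empirically: for a fixed state, the TD(0) update target R + γV(s') is a random variable whose variance we can estimate by sampling. All numbers below (the two successor states, their values, and the transition probability) are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = 0.9
V = {"a": 1.0, "b": -1.0}  # fixed value estimates for successor states


def sample_target():
    # From the state of interest: reach "a" with prob 0.7 (reward 1),
    # otherwise "b" (reward 0). Returns one sample of R + gamma * V(s').
    if rng.random() < 0.7:
        return 1.0 + gamma * V["a"]
    return 0.0 + gamma * V["b"]


targets = np.array([sample_target() for _ in range(100_000)])
mean_est, var_est = targets.mean(), targets.var()
```

With these assumed numbers the target takes value 1.9 with probability 0.7 and -0.9 with probability 0.3, so its true mean is 1.06 and its true variance is about 1.65; the sample statistics recover this, which is the kind of quantity a learned variance estimator would track online.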
no code implementations • 26 Apr 2017 • Kenny Young, Ryan B. Hayward
We present Solrex, an automated solver for the game of Reverse Hex. Reverse Hex, also known as Rex or Misère Hex, is the variant of the game of Hex in which the player who joins her two sides loses the game.
no code implementations • 24 Apr 2016 • Kenny Young, Ryan Hayward, Gautham Vasan
DeepMind's recent spectacular success in using deep convolutional neural nets and machine learning to build superhuman-level agents (e.g., for Atari games via deep Q-learning and for the game of Go via reinforcement learning) raises many questions, including to what extent these methods will succeed in other domains.