no code implementations • 28 Dec 2022 • Chris Nota
Many popular policy gradient methods for reinforcement learning follow a biased approximation of the policy gradient known as the discounted approximation.
no code implementations • 6 Jan 2020 • Francisco M. Garcia, Chris Nota, Philip S. Thomas
Reinforcement learning (RL) has become an increasingly active area of research in recent years.
no code implementations • 17 Jun 2019 • Chris Nota, Philip S. Thomas
The policy gradient theorem describes the gradient of the expected discounted return with respect to an agent's policy parameters.
no code implementations • 6 Jun 2019 • Philip S. Thomas, Scott M. Jordan, Yash Chandak, Chris Nota, James Kostas
We propose a new objective function for finite-horizon episodic Markov decision processes that better captures Bellman's principle of optimality, and provide an expression for the gradient of the objective.
1 code implementation • 5 Jun 2019 • Yash Chandak, Georgios Theocharous, Chris Nota, Philip S. Thomas
[...] have been well-studied in the lifelong learning literature, the setting where the action set changes remains unaddressed.
no code implementations • ICML 2020 • James E. Kostas, Chris Nota, Philip S. Thomas
Coagent policy gradient algorithms (CPGAs) are reinforcement learning algorithms for training a class of stochastic neural networks called coagent networks.