21 Oct 2020 • Joan Bas-Serrano, Sebastian Curi, Andreas Krause, Gergely Neu
We propose a new reinforcement learning algorithm derived from a regularized linear-programming formulation of optimal control in MDPs.
L4DC 2020 • Joan Bas-Serrano, Gergely Neu
We consider the problem of computing optimal policies in average-reward Markov decision processes.