no code implementations • 4 Oct 2023 • Bekzhan Kerimkulov, James-Michael Leahy, David Šiška, Lukasz Szpruch, Yufei Zhang
We study the global convergence of a Fisher-Rao policy gradient flow for infinite-horizon entropy-regularised Markov decision processes with Polish state and action spaces.
no code implementations • 18 Jan 2022 • Bekzhan Kerimkulov, James-Michael Leahy, David Šiška, Lukasz Szpruch
We show that the objective function is increasing along the gradient flow.