no code implementations • 3 Dec 2021 • Hepeng Li, Nicholas Clavette, Haibo He
We present an analytical policy update rule that is independent of parametric function approximators.
Reinforcement Learning (RL)