no code implementations • 16 Feb 2022 • Ziad Kobeissi, Francis Bach
We consider the problem of continuous-time policy evaluation.
reinforcement-learning Stochastic Optimization