# Q* Approximation Schemes for Batch Reinforcement Learning: A Theoretical Comparison

9 Mar 2020 · Tengyang Xie, Nan Jiang

We prove performance guarantees for two algorithms that approximate $Q^\star$ in batch reinforcement learning. Compared to classical iterative methods such as Fitted Q-Iteration, whose performance loss incurs quadratic dependence on horizon, these methods estimate (some forms of) the Bellman error and enjoy linear-in-horizon error propagation, a property established for the first time for algorithms that rely solely on batch data and output stationary policies...

