no code implementations • NeurIPS 2012 • Amir M. Farahmand, Doina Precup
Value Pursuit Iteration (VPI) has two main features: first, it is a nonparametric algorithm that finds a good sparse approximation of the optimal value function given a dictionary of features.
no code implementations • NeurIPS 2008 • Amir M. Farahmand, Mohammad Ghavamzadeh, Shie Mannor, Csaba Szepesvári
In this paper we consider approximate policy-iteration-based reinforcement learning algorithms.