no code implementations • NeurIPS 2010 • Mahdi M. Fard, Joelle Pineau
This paper introduces the first set of PAC-Bayesian bounds for the batch reinforcement learning problem in finite state spaces.
no code implementations • NeurIPS 2008 • Mahdi M. Fard, Joelle Pineau
Markov Decision Processes (MDPs) have been extensively studied and used in the context of planning and decision-making, and many methods exist to find the optimal policy for problems modelled as MDPs.