18 Jun 2012 • Steffen Grunewalder, Guy Lever, Luca Baldassarre, Massi Pontil, Arthur Gretton
For policy optimisation, we compare with least-squares policy iteration in which a Gaussian process is used for value function estimation.