Search Results for author: L. Weaver

Found 1 paper, 0 papers with code

Experiments with Infinite-Horizon, Policy-Gradient Estimation

no code implementations • 3 Jun 2011 • J. Baxter, P. L. Bartlett, L. Weaver

These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter and Bartlett, this volume), which computes biased estimates of the performance gradient in POMDPs.
