3 Jun 2011 • J. Baxter, P. L. Bartlett, L. Weaver
These algorithms are based on GPOMDP, an algorithm introduced in a companion paper (Baxter and Bartlett, this volume), which computes biased estimates of the performance gradient in POMDPs.
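As a rough illustration of the kind of estimate GPOMDP produces, the sketch below follows the commonly stated form of the algorithm: an eligibility trace z accumulates the discounted score function of the policy, and the gradient estimate is the running average of reward times trace. The toy problem (a single observation, a two-action softmax policy, reward 1 for action 0) and the names `softmax`, `gpomdp_estimate`, and `reward_fn` are assumptions made for this example, not taken from the paper. In this memoryless toy problem the discount factor beta does not bias the estimate; in general, beta below 1 trades bias for lower variance.

```python
import math
import random


def softmax(theta):
    """Numerically stable softmax over a list of action preferences."""
    m = max(theta)
    exps = [math.exp(t - m) for t in theta]
    total = sum(exps)
    return [e / total for e in exps]


def gpomdp_estimate(theta, reward_fn, beta, steps, rng):
    """Hypothetical GPOMDP-style gradient estimate for a stateless toy problem.

    theta     : softmax policy parameters, one per action (assumed setup)
    reward_fn : maps the chosen action index to a scalar reward (assumed)
    beta      : discount factor in [0, 1) applied to the eligibility trace
    """
    n = len(theta)
    z = [0.0] * n       # eligibility trace
    delta = [0.0] * n   # running average of reward * trace
    for t in range(steps):
        probs = softmax(theta)
        u = rng.choices(range(n), weights=probs)[0]
        r = reward_fn(u)
        for i in range(n):
            # Score function of a softmax policy: indicator(i == u) - probs[i]
            score = (1.0 if i == u else 0.0) - probs[i]
            z[i] = beta * z[i] + score
            delta[i] += (r * z[i] - delta[i]) / (t + 1)
    return delta


rng = random.Random(0)
estimate = gpomdp_estimate(
    theta=[0.0, 0.0],
    reward_fn=lambda u: 1.0 if u == 0 else 0.0,
    beta=0.5,
    steps=20000,
    rng=rng,
)
```

For this toy setup the exact gradient of the expected reward at theta = [0, 0] is [0.25, -0.25], so `estimate` should land close to those values, with the gap shrinking as `steps` grows.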