no code implementations • NeurIPS 2010 • Jonathan Sorg, Richard L. Lewis, Satinder P. Singh
In this work, we develop a gradient ascent approach with formal convergence guarantees for approximately solving the optimal reward problem online during an agent's lifetime.
no code implementations • NeurIPS 2008 • Erik Talvitie, Satinder P. Singh
We present a novel mathematical formalism for the idea of a "local model," a model of a potentially complex dynamical system that makes only certain predictions in only certain situations.