no code implementations • 10 Jan 2013 • Lex Weaver, Nigel Tao
There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward.
no code implementations • 10 Jan 1999 • Jonathan Baxter, Andrew Tridgell, Lex Weaver
In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with game-tree search.