Search Results for author: Lex Weaver

The Optimal Reward Baseline for Gradient-Based Reinforcement Learning

There exist a number of reinforcement learning algorithms which learn by climbing the gradient of expected reward.

In this paper we present TDLeaf(lambda), a variation on the TD(lambda) algorithm that enables it to be used in conjunction with game-tree search.
