Policy Gradient Methods for Reinforcement Learning with Function Approximation and Action-Dependent Baselines

NeurIPS 1999  ·  Philip S. Thomas, Emma Brunskill

We show how an action-dependent baseline can be used in the policy gradient theorem with function approximation, which was originally presented with action-independent baselines by Sutton et al. (2000).
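For context, the distinction between the two kinds of baseline can be checked numerically. The sketch below is not the paper's construction; it assumes a single state, a softmax policy over three actions, and hypothetical names (`softmax`, `expected_baseline_term`, the example logits and baseline values). It computes the expected baseline term E[∇ log π(a) · b(a)], which is zero when b does not depend on the action and generally nonzero when it does, so an action-dependent baseline cannot simply be subtracted without accounting for that term.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_baseline_term(logits, baseline):
    """Return E_{a~pi}[ d/d theta log pi(a) * b(a) ] for a softmax policy.

    `logits` are the policy parameters theta (one per action, single state);
    `baseline[a]` is the baseline value for action a. Both are illustrative.
    """
    pi = softmax(logits)
    n = len(logits)
    grad = [0.0] * n
    for a in range(n):
        for k in range(n):
            # Softmax score function: d/d theta_k log pi(a) = 1[a == k] - pi(k)
            score = (1.0 if a == k else 0.0) - pi[k]
            grad[k] += pi[a] * score * baseline[a]
    return grad


logits = [0.2, -1.0, 0.5]        # hypothetical policy parameters
constant_b = [3.0, 3.0, 3.0]     # action-independent baseline b(s)
action_b = [3.0, -1.0, 0.5]      # action-dependent baseline b(s, a)

bias_const = expected_baseline_term(logits, constant_b)
bias_action = expected_baseline_term(logits, action_b)

print("action-independent baseline term:", bias_const)
print("action-dependent baseline term:  ", bias_action)
```

In this toy setting the k-th component of the action-dependent term works out to π(k)·(b(k) − E[b]), which vanishes only when the baseline is the same for every action.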
