Search Results for author: Paul Wagner

Found 2 papers, 0 papers with code

Optimistic policy iteration and natural actor-critic: A unifying view and a non-optimality result

no code implementations • NeurIPS 2013 • Paul Wagner

Approximate dynamic programming approaches to the reinforcement learning problem are often categorized into greedy value function methods and value-based policy gradient methods.

Policy Gradient Methods

A reinterpretation of the policy oscillation phenomenon in approximate policy iteration

no code implementations • NeurIPS 2011 • Paul Wagner

We take a fresh view of this phenomenon by casting a considerable subset of the former approach as a limiting special case of the latter.

Policy Gradient Methods
