Search Results for author: Subhojyoti Khastigir

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

In this work, we present both on-policy and off-policy deterministic policy gradient theorems for the average reward performance criterion.
