Search Results for author: Naman Saxena

Off-Policy Average Reward Actor-Critic with Deterministic Policy Search

In this work, we present both on-policy and off-policy deterministic policy gradient theorems for the average reward performance criterion.

We show, both theoretically and experimentally, that our algorithm's updates have low variance and that the training loss decreases smoothly.

Signal Temporal Logic (STL) is a powerful framework for describing the complex temporal and logical behaviour of dynamical systems.
