no code implementations • 20 May 2023 • Naman Saxena, Subhojyoti Khastigir, Shishir Kolathaya, Shalabh Bhatnagar
In this work, we present both on-policy and off-policy deterministic policy gradient theorems for the average reward performance criterion.
no code implementations • 20 May 2023 • Arunselvan Ramaswamy, Shalabh Bhatnagar, Naman Saxena
We show, in theory and through experiments, that our algorithm's updates have low variance and that the training loss decreases smoothly.
no code implementations • 30 Nov 2022 • Naman Saxena, Gorantla Sandeep, Pushpak Jagtap
Signal Temporal Logic (STL) is a powerful framework for describing the complex temporal and logical behaviour of dynamical systems.