no code implementations • 19 Oct 2021 • Raghuram Bharadwaj Diddigi, Prateek Jain, Prabuchandran K. J., Shalabh Bhatnagar
Learning optimal behavior from existing data is one of the most important problems in Reinforcement Learning (RL).
2 code implementations • 7 Jan 2021 • P. Parnika, Raghuram Bharadwaj Diddigi, Sai Koti Reddy Danda, Shalabh Bhatnagar
In this work, we consider the problem of computing optimal actions for Reinforcement Learning (RL) agents in a co-operative setting, where the objective is to optimize a common goal.
1 code implementation • 6 Feb 2020 • Shravan Nayak, Chanakya Ajit Ekbote, Annanya Pratap Singh Chauhan, Raghuram Bharadwaj Diddigi, Prishita Ray, Abhinava Sikdar, Sai Koti Reddy Danda, Shalabh Bhatnagar
A microgrid is capable of generating a limited amount of energy from a renewable resource and is responsible for handling the demands of its dedicated customers.
1 code implementation • 13 Nov 2019 • Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar
In this work, we propose a convergent on-line off-policy TD algorithm under linear function approximation.
no code implementations • 16 Jun 2019 • Raghuram Bharadwaj Diddigi, Chandramouli Kamanchi, Shalabh Bhatnagar
This problem is formulated as a min-max Markov game in the literature.
2 code implementations • 10 May 2019 • Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar
In this work, we propose a second order value iteration procedure that is obtained by applying the Newton-Raphson method to the successive relaxation value iteration scheme.
no code implementations • 9 Mar 2019 • Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Shalabh Bhatnagar
We first derive a modified fixed point iteration for SOR Q-values and utilize stochastic approximation to derive a learning algorithm to compute the optimal value function and an optimal policy.
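The snippet above does not spell out the modified fixed point iteration, so the following is only a minimal sketch of a successive relaxation (SOR) backup in its model-based form, not the authors' Q-learning algorithm. It assumes the commonly used relaxed Bellman operator, in which the usual backup is blended with the current value through a relaxation factor `w`, with the largest admissible factor taken as `1 / (1 - gamma * min p(s|s,a))`. The function name `sor_value_iteration`, the array layouts of `P` and `R`, and the stopping rule are all hypothetical choices made for illustration.

```python
import numpy as np


def sor_value_iteration(P, R, gamma, w=None, tol=1e-10, max_iter=10_000):
    """Value iteration with a successive relaxation backup (illustrative sketch).

    P: array of shape (A, S, S), P[a, s, j] = probability of moving s -> j under a.
    R: array of shape (S, A), expected one-step reward.
    Returns the value estimate and the relaxation factor that was used.
    """
    n_actions, n_states, _ = P.shape
    # Self-transition probabilities p(s | s, a), shape (A, S).
    idx = np.arange(n_states)
    self_loop = P[:, idx, idx]
    # Assumed upper bound on the relaxation factor (w = 1 recovers plain value iteration).
    w_star = 1.0 / (1.0 - gamma * self_loop.min())
    if w is None:
        w = w_star
    V = np.zeros(n_states)
    for _ in range(max_iter):
        # Standard Bellman backup: Q(s, a) = r(s, a) + gamma * sum_j p(j | s, a) V(j).
        Q = R + gamma * np.einsum("asj,j->sa", P, V)
        # Relaxed backup: blend the Bellman backup with the current value of s.
        V_new = np.max(w * Q + (1.0 - w) * V[:, None], axis=1)
        if np.max(np.abs(V_new - V)) < tol:
            return V_new, w
        V = V_new
    return V, w
```

Because `w` is positive, the relaxed operator has the same fixed point as the ordinary Bellman optimality operator, so calling the function with `w=1.0` and with the default factor should give matching values; any difference is in the number of iterations needed.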
no code implementations • 11 Feb 2019 • Chandramouli Kamanchi, Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar
In many practical applications, the analytical form of the density is unknown and only samples from the distribution are available.
no code implementations • 27 Aug 2017 • Raghuram Bharadwaj Diddigi, Prabuchandran K. J., Shalabh Bhatnagar
We consider the problem of tracking an intruder using a network of wireless sensors.
no code implementations • 25 Aug 2017 • Raghuram Bharadwaj Diddigi, D. Sai Koti Reddy, Shalabh Bhatnagar
Finally, we also consider a variant of this problem where the cost of power production at the main site is taken into consideration.