Search Results for author: Arghyadip Roy

Found 4 papers, 0 papers with code

A Policy Gradient Algorithm for the Risk-Sensitive Exponential Cost MDP

no code implementations • 8 Feb 2022 • Mehrdad Moharrami, Yashaswini Murthy, Arghyadip Roy, R. Srikant

We study the risk-sensitive exponential cost MDP formulation and develop a trajectory-based gradient algorithm to find the stationary point of the cost associated with a set of parameterized policies.

Adaptive KL-UCB based Bandit Algorithms for Markovian and i.i.d. Settings

no code implementations • 14 Sep 2020 • Arghyadip Roy, Sanjay Shakkottai, R. Srikant

i.i.d. rewards are a special case of Markov rewards, and it is difficult to design an algorithm that works well independent of whether the underlying model is truly Markovian or i.i.d.

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

no code implementations • 21 Dec 2019 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice.

reinforcement-learning, Reinforcement Learning (RL)

A Structure-aware Online Learning Algorithm for Markov Decision Processes

no code implementations • 28 Nov 2018 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

In this paper, we propose a new RL algorithm which utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space.

Management, Reinforcement Learning (RL)
