Search Results for author: Vivek Borkar

Found 10 papers, 0 papers with code

Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

no code implementations • 7 Apr 2023 • Tejas Pagare, Vivek Borkar, Konstantin Avrachenkov

We extend the provably convergent Full Gradient DQN algorithm for discounted reward Markov decision processes from Avrachenkov et al. (2021) to average reward problems.

Multi-Armed Bandits Q-Learning +1

A Concentration Bound for Distributed Stochastic Approximation

no code implementations • 9 Oct 2022 • Harsh Dolhare, Vivek Borkar

We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with consensus.

Concentration bounds for SSP Q-learning for average cost MDPs

no code implementations • 7 Jun 2022 • Shaan ul Haque, Vivek Borkar

We derive a concentration bound for a Q-learning algorithm for average cost Markov decision processes based on an equivalent shortest path problem, and compare it numerically with the alternative scheme based on relative value iteration.

Q-Learning

The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

no code implementations • 27 Oct 2021 • Vivek Borkar, Shuhang Chen, Adithya Devraj, Ioannis Kontoyiannis, Sean Meyn

In addition to standard Lipschitz assumptions and conditions on the vanishing step-size sequence, it is assumed that the associated mean flow $\tfrac{d}{dt} \vartheta_t = \bar{f}(\vartheta_t)$ is globally asymptotically stable with stationary point denoted $\theta^*$, where $\bar{f}(\theta) = \text{E}[f(\theta,\Phi)]$ with $\Phi$ having the stationary distribution of the chain.

reinforcement-learning Reinforcement Learning (RL)
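
The mean-flow assumption in the abstract above can be illustrated with a toy stochastic approximation recursion $\theta_{n+1} = \theta_n + a_{n+1} f(\theta_n, \Phi_{n+1})$ driven by a Markov chain. The sketch below is not taken from the paper: the two-state chain, the choice $f(\theta,\Phi) = \Phi - \theta$, the step size $a_n = 1/n$, and the function name `run_sa` are all assumptions made for illustration. Under these choices $\bar{f}(\theta) = 1/3 - \theta$, so the mean flow is globally asymptotically stable with $\theta^* = 1/3$.

```python
import random


def run_sa(n_steps=200_000, theta0=5.0, seed=0):
    """Toy stochastic approximation with Markovian noise (illustrative only).

    Phi is a two-state Markov chain on {0, 1} that stays in state 0 with
    probability 0.9 and in state 1 with probability 0.8, so its stationary
    distribution puts mass 1/3 on state 1 and E[Phi] = 1/3.
    """
    rng = random.Random(seed)
    p_stay = {0: 0.9, 1: 0.8}  # assumed transition probabilities
    phi = 0
    theta = theta0
    for n in range(1, n_steps + 1):
        # Advance the chain one step.
        if rng.random() >= p_stay[phi]:
            phi = 1 - phi
        # Vanishing step size a_n = 1/n.
        a_n = 1.0 / n
        # f(theta, phi) = phi - theta, so the mean field is 1/3 - theta.
        theta += a_n * (phi - theta)
    return theta


theta_star = 1.0 / 3.0
theta_hat = run_sa()
print(f"estimate = {theta_hat:.4f}, theta* = {theta_star:.4f}")
```

With $a_n = 1/n$ the iterate reduces to the running average of the chain, so convergence to $\theta^*$ here follows from the ergodic theorem. The paper's setting is considerably more general than this example.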

A Unified Batch Selection Policy for Active Metric Learning

no code implementations • 15 Feb 2021 • Priyadarshini K, Siddhartha Chaudhuri, Vivek Borkar, Subhasis Chaudhuri

To avoid redundancy between triplets, our method collectively selects batches with maximum joint entropy, which simultaneously captures both informativeness and diversity.

Active Learning Informativeness +1

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

no code implementations • 21 Dec 2019 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

To overcome the curses of dimensionality and modeling that affect Dynamic Programming (DP) methods for solving Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice.

reinforcement-learning Reinforcement Learning (RL)

A Structure-aware Online Learning Algorithm for Markov Decision Processes

no code implementations • 28 Nov 2018 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

In this paper, we propose a new RL algorithm which utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space.

Management Reinforcement Learning (RL)

Approachability in Stackelberg Stochastic Games with Vector Costs

no code implementations • 3 Nov 2014 • Dileep Kalathil, Vivek Borkar, Rahul Jain

Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's.

Decision Making
