Search Results for author: Vivek Borkar

Found 12 papers, 0 papers with code

Tabular and Deep Learning for the Whittle Index

no code implementations • 4 Jun 2024 • Francisco Robledo Relaño, Vivek Borkar, Urtzi Ayesta, Konstantin Avrachenkov

The Whittle index policy is a heuristic that has shown remarkably good performance (with guaranteed asymptotic optimality) when applied to the class of problems known as Restless Multi-Armed Bandit Problems (RMABPs).

Deep Learning, Q-Learning

An Asymptotic CVaR Measure of Risk for Markov Chains

no code implementations • 22 May 2024 • Shivam Patel, Vivek Borkar

Risk sensitive decision making finds important applications in current day use cases.

Decision Making, Density Estimation

Full Gradient Deep Reinforcement Learning for Average-Reward Criterion

no code implementations • 7 Apr 2023 • Tejas Pagare, Vivek Borkar, Konstantin Avrachenkov

We extend the provably convergent Full Gradient DQN algorithm for discounted reward Markov decision processes from Avrachenkov et al. (2021) to average reward problems.

Deep Reinforcement Learning, Multi-Armed Bandits, +2

A Concentration Bound for Distributed Stochastic Approximation

no code implementations • 9 Oct 2022 • Harsh Dolhare, Vivek Borkar

We revisit the classical model of Tsitsiklis, Bertsekas and Athans for distributed stochastic approximation with consensus.

Concentration bounds for SSP Q-learning for average cost MDPs

no code implementations • 7 Jun 2022 • Shaan ul Haque, Vivek Borkar

We derive a concentration bound for a Q-learning algorithm for average cost Markov decision processes based on an equivalent shortest path problem, and compare it numerically with the alternative scheme based on relative value iteration.

Q-Learning

The ODE Method for Asymptotic Statistics in Stochastic Approximation and Reinforcement Learning

no code implementations • 27 Oct 2021 • Vivek Borkar, Shuhang Chen, Adithya Devraj, Ioannis Kontoyiannis, Sean Meyn

The paper concerns the $d$-dimensional stochastic approximation recursion, $$ \theta_{n+1}= \theta_n + \alpha_{n + 1} f(\theta_n, \Phi_{n+1}) $$ where $ \{ \Phi_n \}$ is a stochastic process on a general state space, satisfying a conditional Markov property that allows for parameter-dependent noise.

reinforcement-learning, Reinforcement Learning (RL)
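
The recursion quoted in this abstract can be illustrated with a small scalar sketch. Everything concrete here is an assumption chosen for illustration and is not taken from the paper: the noise $\Phi_n$ is a two-state Markov chain on $\{0, 1\}$ with hypothetical switching probabilities `p` and `q`, the update function is $f(\theta, \phi) = \phi - \theta$, and the step size is $\alpha_n = 1/n$. The chain here does not depend on $\theta$, so this covers only the simpler parameter-independent case. Under these choices $\theta_n$ is the running average of the chain and should approach its stationary mean $p/(p+q)$.

```python
import random


def run_sa(num_steps=50_000, p=0.3, q=0.1, theta0=0.0, seed=0):
    """Iterate theta_{n+1} = theta_n + alpha_{n+1} * f(theta_n, Phi_{n+1}).

    Illustrative assumptions (not from the paper):
      - Phi is a two-state Markov chain on {0, 1} with
        P(0 -> 1) = p and P(1 -> 0) = q,
      - f(theta, phi) = phi - theta,
      - alpha_n = 1 / n.
    """
    rng = random.Random(seed)
    theta = theta0
    phi = 0  # initial state of the Markov noise
    for n in range(1, num_steps + 1):
        # Advance the Markov chain by one step to obtain Phi_{n}.
        u = rng.random()
        if phi == 0:
            phi = 1 if u < p else 0
        else:
            phi = 0 if u < q else 1
        # Stochastic approximation update with decreasing step size.
        alpha = 1.0 / n
        theta += alpha * (phi - theta)
    return theta


theta_hat = run_sa()
stationary_mean = 0.3 / (0.3 + 0.1)  # p / (p + q) for the defaults above
print(f"estimate = {theta_hat:.4f}, stationary mean = {stationary_mean:.4f}")
```

The associated mean-flow ODE for this toy choice is $\dot{\theta} = \pi(1) - \theta$, where $\pi(1) = p/(p+q)$ is the stationary probability of state 1, which is why the iterate settles near that value.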

A Unified Batch Selection Policy for Active Metric Learning

no code implementations • 15 Feb 2021 • Priyadarshini K, Siddhartha Chaudhuri, Vivek Borkar, Subhasis Chaudhuri

To avoid redundancy between triplets, our method collectively selects batches with maximum joint entropy, which simultaneously captures both informativeness and diversity.

Active Learning, Diversity, +3

Online Reinforcement Learning of Optimal Threshold Policies for Markov Decision Processes

no code implementations • 21 Dec 2019 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

To overcome the curses of dimensionality and modeling of Dynamic Programming (DP) methods to solve Markov Decision Process (MDP) problems, Reinforcement Learning (RL) methods are adopted in practice.

reinforcement-learning, Reinforcement Learning, +1

A Structure-aware Online Learning Algorithm for Markov Decision Processes

no code implementations • 28 Nov 2018 • Arghyadip Roy, Vivek Borkar, Abhay Karandikar, Prasanna Chaporkar

In this paper, we propose a new RL algorithm which utilizes the known threshold structure of the optimal policy while learning by reducing the feasible policy space.

Management, Reinforcement Learning, +1

Approachability in Stackelberg Stochastic Games with Vector Costs

no code implementations • 3 Nov 2014 • Dileep Kalathil, Vivek Borkar, Rahul Jain

Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's.

Decision Making, Reinforcement Learning
