Search Results for author: Dileep Kalathil

Found 13 papers, 1 paper with code

PyProD: A Machine Learning-Friendly Platform for Protection Analytics in Distribution Systems

no code implementations • 13 Sep 2021 • Dongqi Wu, Dileep Kalathil, Miroslav Begovic, Le Xie

This paper introduces PyProD, a Python-based machine learning (ML)-compatible test-bed for evaluating the efficacy of protection schemes in electric distribution grids.

Decision Making

Learning Policies with Zero or Bounded Constraint Violation for Constrained MDPs

no code implementations • 4 Jun 2021 • Tao Liu, Ruida Zhou, Dileep Kalathil, P. R. Kumar, Chao Tian

We show that when a strictly safe policy is known, then one can confine the system to zero constraint violation with arbitrarily high probability while keeping the reward regret of order $\tilde{\mathcal{O}}(\sqrt{K})$.

Safe Exploration

Fully Decentralized Reinforcement Learning-based Control of Photovoltaics in Distribution Grids for Joint Provision of Real and Reactive Power

no code implementations • 3 Aug 2020 • Rayan El Helou, Dileep Kalathil, Le Xie

In this paper, we introduce a new framework to address the problem of voltage regulation in unbalanced distribution grids with deep photovoltaic penetration.

Learning with Safety Constraints: Sample Complexity of Reinforcement Learning for Constrained MDPs

no code implementations • 1 Aug 2020 • Aria HasanzadeZonuzy, Archana Bura, Dileep Kalathil, Srinivas Shakkottai

Many physical systems have underlying safety considerations that require that the policy employed ensures the satisfaction of a set of constraints.

Reinforcement Learning for Mean Field Games with Strategic Complementarities

no code implementations • 21 Jun 2020 • Kiyeob Lee, Desik Rengarajan, Dileep Kalathil, Srinivas Shakkottai

We introduce a natural refinement to the equilibrium concept that we call Trembling-Hand-Perfect MFE (T-MFE), which allows agents to employ a measure of randomization while accounting for the impact of such randomization on their payoffs.

Robust Reinforcement Learning using Least Squares Policy Iteration with Provable Performance Guarantees

no code implementations • 20 Jun 2020 • Kishan Panaganti, Dileep Kalathil

We first propose the Robust Least Squares Policy Evaluation algorithm, which is a multi-step online model-free learning algorithm for policy evaluation.

OpenAI Gym

Deep Reinforcement Learning-Based Robust Protection in DER-Rich Distribution Grids

no code implementations • 5 Mar 2020 • Dongqi Wu, Dileep Kalathil, Miroslav Begovic, Le Xie

This paper introduces the concept of a deep reinforcement learning-based architecture for protective relay design in power distribution systems with many distributed energy resources (DERs).

Bounded Regret for Finitely Parameterized Multi-Armed Bandits

no code implementations • 3 Mar 2020 • Kishan Panaganti, Dileep Kalathil

We propose a simple, easy-to-implement algorithm, which we call the Finitely Parameterized Upper Confidence Bound (FP-UCB) algorithm, that uses information about the underlying parameter set for faster learning.

Multi-Armed Bandits

Decoupled Data Based Approach for Learning to Control Nonlinear Dynamical Systems

1 code implementation • 17 Apr 2019 • Ran Wang, Karthikeya Parunandi, Dan Yu, Dileep Kalathil, Suman Chakravorty

This paper proposes a novel decoupled data-based control (D2C) algorithm that addresses this problem using a decoupled, "open loop - closed loop", approach.

QFlow: A Learning Approach to High QoE Video Streaming at the Wireless Edge

no code implementations • 4 Jan 2019 • Rajarshi Bhattacharyya, Archana Bura, Desik Rengarajan, Mason Rumuly, Bainan Xia, Srinivas Shakkottai, Dileep Kalathil, Ricky K. P. Mok, Amogh Dhamdhere

The predominant use of wireless access networks is for media streaming applications, which are only gaining popularity as ever more devices become available for this purpose.

On Regret-Optimal Learning in Decentralized Multi-player Multi-armed Bandits

no code implementations • 4 May 2015 • Naumaan Nayyar, Dileep Kalathil, Rahul Jain

The objective is to design a policy that maximizes the expected reward over a time horizon in the single-player setting and the sum of expected rewards in the multiplayer setting.

Multi-Armed Bandits

Empirical Q-Value Iteration

no code implementations • 30 Nov 2014 • Dileep Kalathil, Vivek S. Borkar, Rahul Jain

We propose a new simple and natural algorithm for learning the optimal Q-value function of a discounted-cost Markov Decision Process (MDP) when the transition kernels are unknown.

Q-Learning

Approachability in Stackelberg Stochastic Games with Vector Costs

no code implementations • 3 Nov 2014 • Dileep Kalathil, Vivek Borkar, Rahul Jain

Firstly, we give a simple and computationally tractable strategy for approachability for Stackelberg stochastic games along the lines of Blackwell's.

Decision Making
