1 code implementation • 27 Oct 2023 • Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh
The goal of an offline reinforcement learning (RL) algorithm is to learn optimal policies using historical (offline) data, without access to the environment for online exploration.
1 code implementation • 5 Mar 2023 • Zaiyan Xu, Kishan Panaganti, Dileep Kalathil
We formulate this as a distributionally robust reinforcement learning (DR-RL) problem, where the objective is to learn a policy that maximizes the value function against the worst-case stochastic model of the environment within an uncertainty set.
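In standard notation (the symbols below are assumed, not taken from the abstract), the max-min objective described here is typically written as:

```latex
\pi^{*} \in \arg\max_{\pi}\; \min_{P \in \mathcal{P}}\;
  \mathbb{E}^{\pi}_{P}\!\Big[\textstyle\sum_{t=0}^{\infty} \gamma^{t} r_t\Big],
```

where $\mathcal{P}$ is the uncertainty set of transition models, $\gamma$ is the discount factor, and the inner minimization picks the worst possible model for the candidate policy $\pi$.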
1 code implementation • 28 Nov 2022 • Jessica Maghakian, Paul Mineiro, Kishan Panaganti, Mark Rucker, Akanksha Saran, Cheng Tan
In an era of countless content offerings, recommender systems alleviate information overload by providing users with personalized content suggestions.
1 code implementation • 10 Aug 2022 • Kishan Panaganti, Zaiyan Xu, Dileep Kalathil, Mohammad Ghavamzadeh
The goal of robust reinforcement learning (RL) is to learn a policy that is robust against the uncertainty in model parameters.
1 code implementation • 18 Dec 2021 • Sutanoy Dasgupta, Yabo Niu, Kishan Panaganti, Dileep Kalathil, Debdeep Pati, Bani Mallick
We consider the off-policy evaluation (OPE) problem in contextual bandits, where the goal is to estimate the value of a target policy using the data collected by a logging policy.
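As background for the OPE setting, here is a minimal sketch of the standard inverse-propensity-scoring (IPS) estimator for contextual bandits; this illustrates the problem, not necessarily the estimator proposed in the paper, and all names are illustrative.

```python
import numpy as np

def ips_estimate(rewards, target_probs, logging_probs):
    """Estimate the target policy's value from logged bandit data.

    rewards[i]       : reward observed for the i-th logged action
    target_probs[i]  : probability the target policy assigns to that action
    logging_probs[i] : probability the logging policy assigned to it
    """
    # Importance weights correct for the mismatch between the two policies.
    weights = np.asarray(target_probs) / np.asarray(logging_probs)
    return float(np.mean(weights * np.asarray(rewards)))

# Example: logging policy uniform over 2 actions; target always picks action 0.
rewards = [1.0, 0.0, 1.0, 1.0]
logged_actions = [0, 1, 0, 0]
target_probs = [1.0 if a == 0 else 0.0 for a in logged_actions]
logging_probs = [0.5] * 4
print(ips_estimate(rewards, target_probs, logging_probs))  # → 1.5
```

IPS is unbiased when the logging probabilities are known and bounded away from zero, which is why OPE methods are usually compared against it.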
1 code implementation • 2 Dec 2021 • Kishan Panaganti, Dileep Kalathil
For each of these uncertainty sets, we give a precise characterization of the sample complexity of our proposed algorithm.
no code implementations • 20 Jun 2020 • Kishan Panaganti, Dileep Kalathil
We first propose the Robust Least Squares Policy Evaluation algorithm, which is a multi-step online model-free learning algorithm for policy evaluation.
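For context, a minimal sketch of classical least-squares temporal-difference (LSTD) policy evaluation with linear features is shown below; the paper's Robust Least Squares Policy Evaluation additionally handles model uncertainty, which this plain version does not, and the helper names are hypothetical.

```python
import numpy as np

def lstd(transitions, phi, gamma=0.9):
    """Plain LSTD(0) policy evaluation.

    transitions : list of (s, r, s_next) tuples collected under the policy
    phi         : feature map, s -> np.ndarray of shape (d,)
    Returns weights w so that V(s) is approximated by phi(s) @ w.
    """
    d = phi(transitions[0][0]).shape[0]
    A = np.zeros((d, d))
    b = np.zeros(d)
    for s, r, s_next in transitions:
        f, f_next = phi(s), phi(s_next)
        A += np.outer(f, f - gamma * f_next)  # accumulate the TD system
        b += r * f
    return np.linalg.solve(A, b)

# Example: a single absorbing state with reward 1 gives V = 1/(1 - gamma) = 10.
phi = lambda s: np.array([1.0])
w = lstd([(0, 1.0, 0)] * 5, phi, gamma=0.9)
print(round(float(w[0]), 6))  # → 10.0
```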
no code implementations • 3 Mar 2020 • Kishan Panaganti, Dileep Kalathil
We propose the Finitely Parameterized Upper Confidence Bound (FP-UCB) algorithm, a simple and easy-to-implement method that uses information about the underlying parameter set for faster learning.
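As background for the confidence-bound idea, here is a sketch of the standard UCB1 algorithm for a multi-armed bandit; per the abstract, FP-UCB additionally exploits knowledge of a finite parameter set, which this generic version omits, and the bandit instance below is hypothetical.

```python
import math
import random

def ucb1(pull, n_arms, horizon):
    """Run UCB1: play the arm with the highest optimistic value estimate."""
    counts = [0] * n_arms
    sums = [0.0] * n_arms
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1  # play each arm once to initialize
        else:
            # Empirical mean plus an exploration bonus that shrinks with counts.
            arm = max(range(n_arms),
                      key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        counts[arm] += 1
        sums[arm] += pull(arm)
    return counts

# Hypothetical two-armed Bernoulli bandit: arm 1 pays off more often on average.
random.seed(0)
counts = ucb1(lambda a: 1.0 if random.random() < (0.3, 0.7)[a] else 0.0,
              n_arms=2, horizon=500)
print(counts)  # arm 1 should receive most of the pulls
```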