no code implementations • 11 Mar 2024 • Navdeep Kumar, Yashaswini Murthy, Itai Shufaro, Kfir Y. Levy, R. Srikant, Shie Mannor
We present the first finite time global convergence analysis of policy gradient in the context of infinite horizon average reward Markov decision processes (MDPs).
no code implementations • 3 Sep 2023 • Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor
In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set.
no code implementations • 9 Jun 2023 • Kaixin Wang, Uri Gadot, Navdeep Kumar, Kfir Levy, Shie Mannor
Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-making that is robust to perturbations on the transition kernel.
no code implementations • NeurIPS 2023 • Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Levy, Shie Mannor
We provide a closed-form expression for the worst occupation measure.
no code implementations • 31 Jan 2023 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
We present an efficient robust value iteration for \texttt{s}-rectangular robust Markov Decision Processes (MDPs) with a time complexity comparable to that of standard (non-robust) MDPs, which is significantly faster than any existing method.
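To make the comparison with standard value iteration concrete, here is a minimal sketch of robust value iteration, not the paper's method: it assumes an (s,a)-rectangular reward-uncertainty ball of radius `alpha`, under which the worst case simply subtracts `alpha` from every reward, so each robust Bellman backup costs the same as a standard one.

```python
import numpy as np

def robust_value_iteration(P, R, gamma=0.9, alpha=0.1, iters=500):
    """Toy robust VI sketch (illustrative assumption, not the paper's algorithm).

    P: (A, S, S) transition kernel, R: (S, A) nominal rewards,
    alpha: radius of an L-infinity reward-uncertainty ball per (s, a).
    """
    S = R.shape[0]
    V = np.zeros(S)
    for _ in range(iters):
        # Worst-case reward is R - alpha; the backup is otherwise standard:
        # Q[s, a] = (R[s, a] - alpha) + gamma * sum_s' P[a, s, s'] * V[s']
        Q = (R - alpha) + gamma * np.einsum("asn,n->sa", P, V)
        V = Q.max(axis=1)
    return V
```

Because the worst case here only shifts all rewards by a constant, the robust optimal value equals the nominal one minus `alpha / (1 - gamma)`, which makes the sketch easy to sanity-check.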
no code implementations • 3 Oct 2022 • Navdeep Kumar, Kaixin Wang, Kfir Levy, Shie Mannor
The policy gradient theorem has proven to be a cornerstone of Linear RL due to its elegance and ease of implementation.
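As a minimal illustration of the theorem, in its simplest one-step (bandit) form with a softmax policy, an assumed toy setting rather than the paper's robust-MDP one, the likelihood-ratio gradient matches a finite-difference gradient of the expected reward:

```python
import numpy as np

r = np.array([1.0, 0.0, 0.5])        # per-action rewards (illustrative)
theta = np.array([0.2, -0.1, 0.3])   # softmax logits

def pi(theta):
    e = np.exp(theta - theta.max())  # stable softmax
    return e / e.sum()

def J(theta):
    return pi(theta) @ r             # expected reward under the policy

# Policy-gradient form: grad J = sum_a pi(a) * grad log pi(a) * r(a),
# where grad log pi(a) = e_a - pi for a softmax policy.
p = pi(theta)
pg = sum(p[a] * ((np.eye(3)[a] - p) * r[a]) for a in range(3))

# Finite-difference check of grad J
eps = 1e-6
fd = np.array([(J(theta + eps * np.eye(3)[i]) - J(theta - eps * np.eye(3)[i]))
               / (2 * eps) for i in range(3)])
print(np.allclose(pg, fd, atol=1e-6))
```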
1 code implementation • 28 May 2022 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
However, it remains unclear how to exploit this equivalence to perform policy improvement steps and obtain the optimal value function or policy.
no code implementations • 30 Jan 2022 • Kaixin Wang, Navdeep Kumar, Kuangqi Zhou, Bryan Hooi, Jiashi Feng, Shie Mannor
The key to this perspective is to decompose the value space, in a state-wise manner, into unions of hypersurfaces.
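A hedged sketch of what the value space looks like (illustrative toy MDP, not taken from the paper): for a two-state, two-action MDP, enumerate all deterministic policies and solve V_pi = (I - gamma * P_pi)^{-1} r_pi. Each V_pi is a point in R^2; fixing the action at one state while varying the other traces the line segments (hypersurfaces in general) whose unions bound the value space.

```python
import numpy as np
from itertools import product

gamma = 0.9
P = np.zeros((2, 2, 2))        # P[a, s, s'] transition kernel
P[0] = np.eye(2)               # action 0: stay
P[1] = np.eye(2)[::-1]         # action 1: swap states
R = np.array([[1.0, 0.0],      # R[s, a] rewards (illustrative)
              [0.0, 0.5]])

def value_of(policy):
    """policy: tuple (a_0, a_1) giving the action taken in each state."""
    P_pi = np.array([P[policy[s], s] for s in range(2)])
    r_pi = np.array([R[s, policy[s]] for s in range(2)])
    # Solve the Bellman equation V = r_pi + gamma * P_pi @ V
    return np.linalg.solve(np.eye(2) - gamma * P_pi, r_pi)

for policy in product(range(2), repeat=2):
    print(policy, value_of(policy))
```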