A Fully Polynomial Time Approximation Scheme for Constrained MDPs and Stochastic Shortest Path under Local Transitions

10 Apr 2022  ·  Majid Khonji ·

The fixed-horizon constrained Markov Decision Process (C-MDP) is a well-known model for planning in stochastic environments under operating constraints. Chance-Constrained MDP (CC-MDP) is a variant that allows bounding the probability of constraint violation, which is desired in many safety-critical applications. CC-MDP can also model a class of MDPs, called Stochastic Shortest Path (SSP), under dead-ends, where there is a trade-off between the probability-to-goal and cost-to-goal. This work studies the structure of (C)C-MDP, particularly an important variant that involves local transition. In this variant, the state reachability exhibits a certain degree of locality and independence from the remaining states. More precisely, the number of states, at a given time, that share some reachable future states is always constant. (C)C-MDP under local transition is NP-Hard even for a planning horizon of two. In this work, we propose a fully polynomial-time approximation scheme for (C)C-MDP that computes (near) optimal deterministic policies. Such an algorithm is among the best approximation algorithm attainable in theory and gives insights into the approximability of constrained MDP and its variants.

PDF Abstract
No code implementations yet. Submit your code now

Tasks


Datasets


  Add Datasets introduced or used in this paper

Results from the Paper


  Submit results from this paper to get state-of-the-art GitHub badges and help the community compare results to other papers.

Methods


No methods listed for this paper. Add relevant methods here