no code implementations • 16 Sep 2020 • Kumar Ashutosh, Sarthak Consul, Bhishma Dedhia, Parthasarathi Khirwadkar, Sahil Shah, Shivaram Kalyanakrishnan
An important theoretical question is how many iterations a specified PI variant will take to terminate as a function of the number of states $n$ and the number of actions $k$ in the input MDP.
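To make the quantity being counted concrete, here is a minimal sketch of one PI variant (Howard's policy iteration, which switches every improvable state at once) that returns the number of improvement steps it took. It is an illustration under stated assumptions, not the construction analysed in the paper: the discounted setting with `gamma=0.9`, the array layout of `P` and `R`, and the helper names `random_mdp` and `policy_iteration` are all hypothetical choices made here. Since each iteration strictly improves the policy and there are only $k^n$ deterministic policies, the count can never exceed $k^n$.

```python
import numpy as np


def random_mdp(n, k, seed=0):
    """Hypothetical helper: a random MDP with n states and k actions."""
    rng = np.random.default_rng(seed)
    P = rng.random((k, n, n))           # P[a, s, s'] = transition probability
    P /= P.sum(axis=2, keepdims=True)   # normalise each row to sum to 1
    R = rng.random((n, k))              # R[s, a] = expected reward
    return P, R


def policy_iteration(P, R, gamma=0.9, tol=1e-10):
    """Howard's PI (assumed variant). Returns (policy, values, iterations)."""
    k, n, _ = P.shape
    states = np.arange(n)
    policy = np.zeros(n, dtype=int)     # start from the all-zeros policy
    iterations = 0
    while True:
        # Policy evaluation: solve (I - gamma * P_pi) v = r_pi exactly.
        P_pi = P[policy, states, :]
        r_pi = R[states, policy]
        v = np.linalg.solve(np.eye(n) - gamma * P_pi, r_pi)
        # Action values: Q[s, a] = R[s, a] + gamma * sum_s' P[a, s, s'] v[s'].
        Q = R + gamma * np.einsum("asn,n->sa", P, v)
        improvable = Q.max(axis=1) > v + tol
        if not improvable.any():
            return policy, v, iterations
        # Switch every improvable state to a greedy action.
        policy = np.where(improvable, Q.argmax(axis=1), policy)
        iterations += 1


if __name__ == "__main__":
    n, k = 5, 3
    P, R = random_mdp(n, k, seed=1)
    policy, v, iterations = policy_iteration(P, R)
    print(f"terminated after {iterations} iterations (trivial bound: {k**n})")
```

On random instances like this one the count is typically far below $k^n$; lower-bound results concern carefully constructed MDPs on which a given variant is forced to take many iterations.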
no code implementations • 28 Nov 2019 • Sarthak Consul, Bhishma Dedhia, Kumar Ashutosh, Parthasarathi Khirwadkar
We generalize the previous result and prove a novel exponential lower bound on the number of iterations taken by policy iteration for $N$-state, $k$-action MDPs.