NeurIPS 2008 • Jonathan Taylor, Doina Precup, Prakash Panangaden
We prove that the difference between the optimal values of two states is upper-bounded by the value of this metric, and that this bound is tighter than the one provided by bisimulation metrics (Ferns et al. 2004, 2005).