23 Jan 2019 • Richard Klima, Daan Bloembergen, Michael Kaisers, Karl Tuyls
We prove convergence of the operator to the optimal robust Q-function with respect to the model using the theory of Generalized Markov Decision Processes.