no code implementations • 2 Mar 2023 • Orin Levy, Alon Cohen, Asaf Cassel, Yishay Mansour
To the best of our knowledge, our algorithm is the first efficient, rate-optimal regret minimization algorithm for adversarial contextual MDPs (CMDPs) that operates under the minimal standard assumption of online function approximation.
no code implementations • 27 Nov 2022 • Orin Levy, Asaf Cassel, Alon Cohen, Yishay Mansour
To the best of our knowledge, our algorithm is the first efficient and rate-optimal regret minimization algorithm for CMDPs that operates under the general offline function approximation setting.
no code implementations • 3 Jun 2022 • Asaf Cassel, Alon Cohen, Tomer Koren
We consider the problem of controlling an unknown linear dynamical system under adversarially changing convex costs and full feedback of both the state and cost function.
no code implementations • 2 Mar 2022 • Asaf Cassel, Alon Cohen, Tomer Koren
We consider the problem of controlling an unknown linear dynamical system under a stochastic convex cost and full feedback of both the state and cost function.
no code implementations • 25 Feb 2021 • Asaf Cassel, Tomer Koren
We consider the task of learning to control a linear dynamical system under fixed quadratic costs, known as the Linear Quadratic Regulator (LQR) problem.
no code implementations • NeurIPS 2020 • Asaf Cassel, Tomer Koren
We consider the problem of controlling a known linear dynamical system under stochastic noise, adversarially chosen costs, and bandit feedback.
no code implementations • ICML 2020 • Asaf Cassel, Alon Cohen, Tomer Koren
We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown.
no code implementations • 4 Jun 2018 • Asaf Cassel, Shie Mannor, Assaf Zeevi
Unlike the case of cumulative criteria, in the problems we study here the oracle policy, which knows the problem parameters a priori and is used to "center" the regret, is not trivial.