Search Results for author: Asaf Cassel

Found 8 papers, 0 papers with code

Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

no code implementations 2 Mar 2023 Orin Levy, Alon Cohen, Asaf Cassel, Yishay Mansour

To the best of our knowledge, our algorithm is the first efficient rate-optimal regret minimization algorithm for adversarial CMDPs that operates under the minimal standard assumption of online function approximation.

regression

Eluder-based Regret for Stochastic Contextual MDPs

no code implementations 27 Nov 2022 Orin Levy, Asaf Cassel, Alon Cohen, Yishay Mansour

To the best of our knowledge, our algorithm is the first efficient and rate-optimal regret minimization algorithm for CMDPs that operates under the general offline function approximation setting.

regression

Rate-Optimal Online Convex Optimization in Adaptive Linear Control

no code implementations 3 Jun 2022 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of controlling an unknown linear dynamical system under adversarially changing convex costs and full feedback of both the state and cost function.

Efficient Online Linear Control with Stochastic Convex Costs and Unknown Dynamics

no code implementations 2 Mar 2022 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of controlling an unknown linear dynamical system under a stochastic convex cost and full feedback of both the state and cost function.

Online Policy Gradient for Model Free Learning of Linear Quadratic Regulators with $\sqrt{T}$ Regret

no code implementations 25 Feb 2021 Asaf Cassel, Tomer Koren

We consider the task of learning to control a linear dynamical system under fixed quadratic costs, known as the Linear Quadratic Regulator (LQR) problem.

Bandit Linear Control

no code implementations NeurIPS 2020 Asaf Cassel, Tomer Koren

We consider the problem of controlling a known linear dynamical system under stochastic noise, adversarially chosen costs, and bandit feedback.

Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

no code implementations ICML 2020 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown.

A General Framework for Bandit Problems Beyond Cumulative Objectives

no code implementations 4 Jun 2018 Asaf Cassel, Shie Mannor, Assaf Zeevi

Unlike the case of cumulative criteria, in the problems we study here, the oracle policy, which knows the problem parameters a priori and is used to "center" the regret, is not trivial.

Multi-Armed Bandits
