no code implementations • ICML 2020 • Dan Garber, Gal Korcia, Kfir Levy
Focusing on two important families of online tasks, one which generalizes online linear and logistic regression and the other being online PCA, we show that under standard well-conditioned-data assumptions (which are often made in the corresponding offline settings), standard online gradient descent (OGD) methods become much more efficient in the random-order model.
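As a rough illustration of the random-order model (not the paper's analysis), the sketch below runs plain OGD on the squared loss over a uniformly random permutation of noiseless linear-regression data; the step size, data model, and seeds are all hypothetical choices:

```python
import numpy as np

def ogd_linear_regression(X, y, eta=0.1, seed=0):
    """Online gradient descent on squared loss, with the examples
    presented in a uniformly random order (random-order model)."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(y))
    w = np.zeros(X.shape[1])
    total_loss = 0.0
    for t in order:
        pred = X[t] @ w
        total_loss += 0.5 * (pred - y[t]) ** 2
        grad = (pred - y[t]) * X[t]   # gradient of 0.5 * (x.w - y)^2
        w -= eta * grad
    return w, total_loss

# toy well-conditioned data generated from a known linear model
rng = np.random.default_rng(1)
X = rng.normal(size=(200, 3))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true
w_hat, _ = ogd_linear_regression(X, y)
```

On this noiseless, isotropic toy data a single random-order pass already recovers `w_true` closely.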
no code implementations • 3 Sep 2023 • Uri Gadot, Esther Derman, Navdeep Kumar, Maxence Mohamed Elfatihi, Kfir Levy, Shie Mannor
In robust Markov decision processes (RMDPs), it is assumed that the reward and the transition dynamics lie in a given uncertainty set.
no code implementations • 9 Jun 2023 • Kaixin Wang, Uri Gadot, Navdeep Kumar, Kfir Levy, Shie Mannor
Robust Markov Decision Processes (RMDPs) provide a framework for sequential decision-making that is robust to perturbations on the transition kernel.
no code implementations • NeurIPS 2023 • Navdeep Kumar, Esther Derman, Matthieu Geist, Kfir Levy, Shie Mannor
We provide a closed-form expression for the worst occupation measure.
no code implementations • 31 Jan 2023 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
We present an efficient robust value iteration for \texttt{s}-rectangular robust Markov Decision Processes (MDPs), with a time complexity comparable to that of standard (non-robust) MDPs and significantly faster than any existing method.
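For orientation only: the paper's efficient \texttt{s}-rectangular algorithm is more involved, but the basic robust Bellman backup can be sketched with an sa-rectangular uncertainty set consisting of a finite list of candidate transition kernels, taking the worst case at every (state, action) pair. All details below (finite kernel set, discount, iteration count) are illustrative assumptions:

```python
import numpy as np

def robust_value_iteration(Ps, R, gamma=0.9, iters=500):
    """Robust value iteration over a finite uncertainty set.
    Ps: list of transition kernels, each of shape (S, A, S).
    R:  reward matrix of shape (S, A).
    Takes the worst-case kernel independently per (s, a)
    (sa-rectangularity), then the greedy max over actions."""
    S, A = R.shape
    V = np.zeros(S)
    Q = np.zeros((S, A))
    for _ in range(iters):
        # worst-case expected next value over the kernels, per (s, a)
        Q = R + gamma * np.min([P @ V for P in Ps], axis=0)
        V = Q.max(axis=1)
    return V, Q.argmax(axis=1)
```

Enlarging the uncertainty set can only lower the robust value, which gives a simple sanity check on the backup.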
no code implementations • 3 Oct 2022 • Navdeep Kumar, Kaixin Wang, Kfir Levy, Shie Mannor
The policy gradient theorem proves to be a cornerstone in Linear RL due to its elegance and ease of implementability.
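As a minimal, non-robust reminder of what the policy gradient theorem buys in practice, the following REINFORCE-style sketch ascends the score-function gradient estimate $r \cdot \nabla_\theta \log \pi_\theta(a)$ for a softmax policy on a two-armed bandit with deterministic rewards; the learning rate, step count, and reward values are hypothetical:

```python
import numpy as np

def softmax(theta):
    z = np.exp(theta - theta.max())
    return z / z.sum()

def reinforce_bandit(rewards, steps=2000, lr=0.1, seed=0):
    """Stochastic policy-gradient ascent on a bandit: sample an arm
    from the softmax policy, then step along r * grad log pi(a)."""
    rng = np.random.default_rng(seed)
    theta = np.zeros(len(rewards))
    for _ in range(steps):
        pi = softmax(theta)
        a = rng.choice(len(rewards), p=pi)
        grad_log = -pi
        grad_log[a] += 1.0                   # d/dtheta log pi(a)
        theta += lr * rewards[a] * grad_log  # unbiased gradient step
    return softmax(theta)

pi_final = reinforce_bandit(np.array([1.0, 0.2]))
```

The policy concentrates on the higher-reward arm, which is exactly the improvement direction the theorem guarantees in expectation.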
1 code implementation • 28 May 2022 • Navdeep Kumar, Kfir Levy, Kaixin Wang, Shie Mannor
However, it is not clear how to exploit this equivalence in order to perform policy improvement steps and obtain the optimal value function or policy.
no code implementations • NeurIPS 2021 • Kfir Levy, Ali Kavis, Volkan Cevher
In this work we propose $\rm{STORM}^{+}$, a new method that is completely parameter-free, does not require large batch-sizes, and obtains the optimal $O(1/T^{1/3})$ rate for finding an approximate stationary point.
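The core STORM-style recursive momentum estimator is $d_t = g(x_t, \xi_t) + (1-a)\,(d_{t-1} - g(x_{t-1}, \xi_t))$, which evaluates the *same* fresh sample $\xi_t$ at both iterates to cancel noise. $\rm{STORM}^{+}$ goes further by setting the step size and momentum parameter adaptively and parameter-free; the sketch below uses fixed, hypothetical values of `eta` and `a` purely to show the estimator's shape:

```python
import numpy as np

def storm_style(grad_fn, x0, steps=2000, eta=0.1, a=0.2, seed=0):
    """STORM-style variance-reduced SGD with fixed parameters.
    grad_fn(x, xi) returns a stochastic gradient at x using noise xi;
    passing the same xi at two points is what reduces the variance."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    xi = rng.normal(size=x.shape)
    d = grad_fn(x, xi)                       # initial estimator d_0
    for _ in range(steps):
        x_new = x - eta * d
        xi = rng.normal(size=x.shape)        # fresh sample xi_t
        d = grad_fn(x_new, xi) + (1 - a) * (d - grad_fn(x, xi))
        x = x_new
    return x

# stochastic quadratic: f(x) = 0.5 ||x||^2 with additive gradient noise
x_final = storm_style(lambda x, xi: x + xi, np.ones(5))
```

On this toy objective the iterate settles near the minimizer at the origin despite unit-variance gradient noise.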
no code implementations • NeurIPS 2017 • Kfir Levy
We present an approach towards convex optimization that relies on a novel scheme which converts adaptive online algorithms into offline methods.
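The classic online-to-batch template behind such conversions: feed i.i.d. samples of the offline objective to an online algorithm and return the average iterate, whose expected suboptimality is bounded by regret/$T$. The paper's scheme converts *adaptive* online methods; the vanilla sketch below uses plain OGD with a $1/\sqrt{t}$ step size for illustration only:

```python
import numpy as np

def online_to_offline(grad_fn, x0, T=1000, seed=0):
    """Online-to-batch conversion: run OGD on stochastic gradients
    and return the running average of the iterates."""
    rng = np.random.default_rng(seed)
    x = np.array(x0, dtype=float)
    avg = np.zeros_like(x)
    for t in range(1, T + 1):
        g = grad_fn(x, rng)
        x -= g / np.sqrt(t)          # online step, eta_t = 1/sqrt(t)
        avg += (x - avg) / t         # running mean of x_1 .. x_T
    return avg

# stochastic gradients of f(x) = 0.5 ||x||^2
x_bar = online_to_offline(
    lambda x, rng: x + 0.1 * rng.normal(size=x.shape), np.ones(3))
```

The averaged iterate, not the last one, is what inherits the regret guarantee in this conversion.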
no code implementations • NeurIPS 2015 • Tomer Koren, Kfir Levy
In this setting, we establish the first evidence that ERM is able to attain fast generalization rates, and show that the expected loss of the ERM solution in $d$ dimensions converges to the optimal expected loss at a rate of $d/n$.
no code implementations • NeurIPS 2014 • Elad Hazan, Kfir Levy
Bandit Convex Optimization (BCO) is a fundamental framework for decision making under uncertainty, which generalizes many problems from the realm of online and statistical learning.
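A workhorse of BCO (predating this paper, which improves the attainable rates) is the Flaxman–Kalai–McMahan one-point gradient estimator: with $u$ uniform on the unit sphere, $(d/\delta)\,f(x + \delta u)\,u$ is an unbiased gradient estimate of the $\delta$-smoothed function using a single value query. A minimal sketch, with the smoothing radius and test function chosen arbitrarily:

```python
import numpy as np

def one_point_gradient(f, x, delta, rng):
    """One-point bandit gradient estimate: a single function value
    at a randomly perturbed point yields an unbiased estimate of
    the gradient of the delta-smoothed objective."""
    d = len(x)
    u = rng.normal(size=d)
    u /= np.linalg.norm(u)             # uniform on the unit sphere
    return (d / delta) * f(x + delta * u) * u

# average many estimates for f(x) = ||x||^2; for a quadratic the
# smoothed gradient equals the true gradient 2 * x0
rng = np.random.default_rng(0)
x0 = np.array([1.0, -0.5])
est = np.mean([one_point_gradient(lambda z: z @ z, x0, 0.05, rng)
               for _ in range(100000)], axis=0)
```

The estimator's high variance (scaling with $d/\delta$) is precisely why many averaged queries are needed, and why BCO rates are harder to obtain than full-information ones.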