Search Results for author: Uri Sherman

Found 7 papers, 0 papers with code

The Dimension Strikes Back with Gradients: Generalization of Gradient Methods in Stochastic Convex Optimization

no code implementations • 22 Jan 2024 • Matan Schliserman, Uri Sherman, Tomer Koren

Our bound translates to a lower bound of $\Omega (\sqrt{d})$ on the number of training examples required for standard GD to reach a non-trivial test error, answering an open question raised by Feldman (2016) and Amir, Koren, and Livni (2021b) and showing that a non-trivial dimension dependence is unavoidable.

Rate-Optimal Policy Optimization for Linear Markov Decision Processes

no code implementations • 28 Aug 2023 • Uri Sherman, Alon Cohen, Tomer Koren, Yishay Mansour

We study regret minimization in online episodic linear Markov Decision Processes, and obtain rate-optimal $\widetilde O (\sqrt K)$ regret where $K$ denotes the number of episodes.

Improved Regret for Efficient Online Reinforcement Learning with Linear Function Approximation

no code implementations • 30 Jan 2023 • Uri Sherman, Tomer Koren, Yishay Mansour

We study reinforcement learning with linear function approximation and adversarially changing cost functions, a setup that has mostly been considered under simplifying assumptions such as full information feedback or exploratory conditions. We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback, featuring a combination of mirror descent and least-squares policy evaluation in an auxiliary MDP used to compute exploration bonuses. Our algorithm obtains an $\widetilde O(K^{6/7})$ regret bound, improving significantly over the previous state of the art of $\widetilde O (K^{14/15})$ in this setting.

Regret Minimization and Convergence to Equilibria in General-sum Markov Games

no code implementations • 28 Jul 2022 • Liad Erez, Tal Lancewicki, Uri Sherman, Tomer Koren, Yishay Mansour

Our key observation is that online learning via policy optimization in Markov games essentially reduces to a form of weighted regret minimization, with unknown weights determined by the path length of the agents' policy sequence.

Benign Underfitting of Stochastic Gradient Descent

no code implementations • 27 Feb 2022 • Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman

We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" learning rule that achieves generalization performance by obtaining a good fit to training data.

Optimal Rates for Random Order Online Optimization

no code implementations • NeurIPS 2021 • Uri Sherman, Tomer Koren, Yishay Mansour

We study online convex optimization in the random order model, recently proposed by Garber et al. (2020), where the loss functions may be chosen by an adversary, but are then presented to the online algorithm in a uniformly random order.

Lazy OCO: Online Convex Optimization on a Switching Budget

no code implementations • 7 Feb 2021 • Uri Sherman, Tomer Koren

We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds.
