no code implementations • 22 Jan 2024 • Matan Schliserman, Uri Sherman, Tomer Koren
Our bound translates to a lower bound of $\Omega (\sqrt{d})$ on the number of training examples required for standard GD to reach a non-trivial test error, answering an open question raised by Feldman (2016) and Amir, Koren, and Livni (2021b) and showing that a non-trivial dimension dependence is unavoidable.
no code implementations • 28 Aug 2023 • Uri Sherman, Alon Cohen, Tomer Koren, Yishay Mansour
We study regret minimization in online episodic linear Markov Decision Processes, and obtain rate-optimal $\widetilde O (\sqrt K)$ regret, where $K$ denotes the number of episodes.
no code implementations • 30 Jan 2023 • Uri Sherman, Tomer Koren, Yishay Mansour
We study reinforcement learning with linear function approximation and adversarially changing cost functions, a setup that has mostly been considered under simplifying assumptions such as full information feedback or exploratory conditions. We present a computationally efficient policy optimization algorithm for the challenging general setting of unknown dynamics and bandit feedback, featuring a combination of mirror descent and least-squares policy evaluation in an auxiliary MDP used to compute exploration bonuses. Our algorithm obtains an $\widetilde O(K^{6/7})$ regret bound, improving significantly over the previous state-of-the-art of $\widetilde O (K^{14/15})$ in this setting.
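The mirror-descent component of such a policy optimization scheme can be illustrated with a minimal sketch. This is a schematic tabular example, not the authors' algorithm: `q_hat` and `bonus` are hypothetical stand-ins for the least-squares value estimates and exploration bonuses described above, which the actual method computes in an auxiliary MDP.

```python
import numpy as np

def mirror_descent_policy_step(policy, q_hat, bonus, eta=0.1):
    """One entropy-regularized mirror descent (multiplicative weights) update
    of a tabular policy: shift probability mass toward actions with low
    estimated cost, optimistically reduced by an exploration bonus."""
    logits = np.log(policy) - eta * (q_hat - bonus)
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    new_policy = np.exp(logits)
    return new_policy / new_policy.sum(axis=1, keepdims=True)

# Toy example: 3 states, 2 actions, uniform initial policy.
rng = np.random.default_rng(0)
policy = np.full((3, 2), 0.5)
q_hat = rng.uniform(size=(3, 2))   # estimated cost-to-go (stand-in)
bonus = 0.05 * np.ones((3, 2))     # exploration bonus (stand-in)
policy = mirror_descent_policy_step(policy, q_hat, bonus)
```

The exponential-weights form keeps each state's action distribution valid after every update, which is the standard reason mirror descent over the simplex is used in policy optimization.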
no code implementations • 28 Jul 2022 • Liad Erez, Tal Lancewicki, Uri Sherman, Tomer Koren, Yishay Mansour
Our key observation is that online learning via policy optimization in Markov games essentially reduces to a form of weighted regret minimization, with unknown weights determined by the path length of the agents' policy sequence.
no code implementations • 27 Feb 2022 • Tomer Koren, Roi Livni, Yishay Mansour, Uri Sherman
We study to what extent stochastic gradient descent (SGD) may be understood as a "conventional" learning rule that achieves its generalization performance by obtaining a good fit to the training data.
no code implementations • NeurIPS 2021 • Uri Sherman, Tomer Koren, Yishay Mansour
We study online convex optimization in the random order model, recently proposed by Garber et al. (2020), where the loss functions may be chosen by an adversary, but are then presented to the online algorithm in a uniformly random order.
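A minimal simulation of the random order model, under simple illustrative assumptions (one-dimensional linear losses over an interval, projected online gradient descent as the learner; this is a sketch of the setting, not the algorithm analyzed in the paper):

```python
import numpy as np

def ogd_random_order(grads, lr, radius=1.0, seed=0):
    """Run projected online gradient descent on a uniformly shuffled
    sequence of linear losses f_t(x) = g_t * x over [-radius, radius],
    and return regret against the best fixed decision in hindsight."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(len(grads))  # adversary picks the losses,
                                         # nature picks a random order
    x, total = 0.0, 0.0
    for t in order:
        total += grads[t] * x                          # incur loss
        x = float(np.clip(x - lr * grads[t], -radius, radius))  # OGD step
    # Best fixed point for a linear objective lies at an endpoint.
    best = min(np.sum(grads) * u for u in (-radius, radius))
    return total - best

T = 1000
grads = np.where(np.arange(T) % 2 == 0, 1.0, -1.0)  # adversarial-looking losses
regret = ogd_random_order(grads, lr=1.0 / np.sqrt(T))
```

The point of the model is that the same multiset of losses can be much harder in an adversarial order than in a random one; here the shuffle is what the learner gets to exploit.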
no code implementations • 7 Feb 2021 • Uri Sherman, Tomer Koren
We study a variant of online convex optimization where the player is permitted to switch decisions at most $S$ times in expectation throughout $T$ rounds.
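One standard way to respect such a switching budget, shown here purely as an illustration of the constraint and not as the paper's method, is blocked online gradient descent: hold each decision fixed for roughly $T/S$ consecutive rounds, so at most $S$ switches occur.

```python
import numpy as np

def blocked_ogd(grads, num_switches, lr, radius=1.0):
    """Projected OGD that only updates its decision at block boundaries,
    guaranteeing at most `num_switches` decision changes over the horizon."""
    T = len(grads)
    block = max(1, T // num_switches)  # rounds per block
    x, plays, switches = 0.0, [], 0
    pending = 0.0  # gradient accumulated within the current block
    for t in range(T):
        plays.append(x)
        pending += grads[t]
        if (t + 1) % block == 0:  # block boundary: allowed to switch
            new_x = float(np.clip(x - lr * pending, -radius, radius))
            if new_x != x:
                switches += 1
            x, pending = new_x, 0.0
    return plays, switches

grads = np.sin(np.arange(200))  # toy sequence of loss gradients
plays, switches = blocked_ogd(grads, num_switches=10, lr=0.05)
```

Playing each decision for a full block trades a longer effective step against fewer switches, which is the basic tension the switching-constrained setting formalizes.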