Search Results for author: Alon Cohen

Found 22 papers, 0 papers with code

Locally Optimal Descent for Dynamic Stepsize Scheduling

no code implementations 23 Nov 2023 Gilad Yehudai, Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

We introduce a novel dynamic learning-rate scheduling scheme grounded in theory with the goal of simplifying the manual and time-consuming tuning of schedules in practice.

Scheduling Stochastic Optimization

Rate-Optimal Policy Optimization for Linear Markov Decision Processes

no code implementations 28 Aug 2023 Uri Sherman, Alon Cohen, Tomer Koren, Yishay Mansour

We study regret minimization in online episodic linear Markov Decision Processes, and obtain rate-optimal $\widetilde O (\sqrt K)$ regret where $K$ denotes the number of episodes.

APART: Diverse Skill Discovery using All Pairs with Ascending Reward and DropouT

no code implementations 24 Aug 2023 Hadar Schreiber Galler, Tom Zahavy, Guillaume Desjardins, Alon Cohen

This problem is formulated as mutual training of skills using an intrinsic reward and a discriminator trained to predict a skill given its trajectory.

Efficient Rate Optimal Regret for Adversarial Contextual MDPs Using Online Function Approximation

no code implementations 2 Mar 2023 Orin Levy, Alon Cohen, Asaf Cassel, Yishay Mansour

To the best of our knowledge, our algorithm is the first efficient rate optimal regret minimization algorithm for adversarial CMDPs that operates under the minimal standard assumption of online function approximation.

regression

Eluder-based Regret for Stochastic Contextual MDPs

no code implementations 27 Nov 2022 Orin Levy, Asaf Cassel, Alon Cohen, Yishay Mansour

To the best of our knowledge, our algorithm is the first efficient and rate-optimal regret minimization algorithm for CMDPs that operates under the general offline function approximation setting.

regression

Rate-Optimal Online Convex Optimization in Adaptive Linear Control

no code implementations 3 Jun 2022 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of controlling an unknown linear dynamical system under adversarially changing convex costs and full feedback of both the state and cost function.

Efficient Online Linear Control with Stochastic Convex Costs and Unknown Dynamics

no code implementations 2 Mar 2022 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of controlling an unknown linear dynamical system under a stochastic convex cost and full feedback of both the state and cost function.

Asynchronous Stochastic Optimization Robust to Arbitrary Delays

no code implementations NeurIPS 2021 Alon Cohen, Amit Daniely, Yoel Drori, Tomer Koren, Mariano Schain

In the general non-convex smooth optimization setting, we give a simple and efficient algorithm that requires $O( \sigma^2/\epsilon^4 + \tau/\epsilon^2 )$ steps for finding an $\epsilon$-stationary point $x$, where $\tau$ is the average delay $\frac{1}{T}\sum_{t=1}^T d_t$ and $\sigma^2$ is the variance of the stochastic gradients.

Distributed Optimization

Minimax Regret for Stochastic Shortest Path

no code implementations NeurIPS 2021 Alon Cohen, Yonathan Efroni, Yishay Mansour, Aviv Rosenberg

In this work we show that the minimax regret for this setting is $\widetilde O(\sqrt{ (B_\star^2 + B_\star) |S| |A| K})$ where $B_\star$ is a bound on the expected cost of the optimal policy from any state, $S$ is the state space, and $A$ is the action space.

Online Markov Decision Processes with Aggregate Bandit Feedback

no code implementations 31 Jan 2021 Alon Cohen, Haim Kaplan, Tomer Koren, Yishay Mansour

We study a novel variant of online finite-horizon Markov Decision Processes with adversarially changing loss functions and initially unknown dynamics.

Near-optimal Regret Bounds for Stochastic Shortest Path

no code implementations ICML 2020 Alon Cohen, Haim Kaplan, Yishay Mansour, Aviv Rosenberg

In this work we remove this dependence on the minimum cost: we give an algorithm that guarantees a regret bound of $\widetilde{O}(B_\star |S| \sqrt{|A| K})$, where $B_\star$ is an upper bound on the expected cost of the optimal policy, $S$ is the set of states, $A$ is the set of actions and $K$ is the number of episodes.

Reinforcement Learning (RL)

Logarithmic Regret for Learning Linear Quadratic Regulators Efficiently

no code implementations ICML 2020 Asaf Cassel, Alon Cohen, Tomer Koren

We consider the problem of learning in Linear Quadratic Control systems whose transition parameters are initially unknown.

Apprenticeship Learning via Frank-Wolfe

no code implementations 5 Nov 2019 Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour

Specifically, we show that a variation of the FW method that is based on taking "away steps" achieves a linear rate of convergence when applied to AL and that a stochastic version of the FW algorithm can be used to avoid precise estimation of feature expectations.

Unknown mixing times in apprenticeship and reinforcement learning

no code implementations 23 May 2019 Tom Zahavy, Alon Cohen, Haim Kaplan, Yishay Mansour

We derive and analyze learning algorithms for apprenticeship learning, policy evaluation, and policy gradient for average reward criteria.

reinforcement-learning Reinforcement Learning (RL)

Learning Linear-Quadratic Regulators Efficiently with only $\sqrt{T}$ Regret

no code implementations 17 Feb 2019 Alon Cohen, Tomer Koren, Yishay Mansour

We present the first computationally-efficient algorithm with $\widetilde O(\sqrt{T})$ regret for learning in Linear Quadratic Control systems with unknown dynamics.

Open-Ended Question Answering

Learning to Screen

no code implementations NeurIPS 2019 Alon Cohen, Avinatan Hassidim, Haim Kaplan, Yishay Mansour, Shay Moran

(ii) In the second variant it is assumed that before the process starts, the algorithm has access to a training set of $n$ items drawn independently from the same unknown distribution (e.g., data of candidates from previous recruitment seasons).

Learning Approximately Optimal Contracts

no code implementations 16 Nov 2018 Alon Cohen, Moran Koren, Argyrios Deligkas

Furthermore, we show that when there are only two possible outcomes or the agent is risk-neutral, the algorithm's outcome approximates the optimal contract described in the classical theory.

Online Linear Quadratic Control

no code implementations ICML 2018 Alon Cohen, Avinatan Hassidim, Tomer Koren, Nevena Lazic, Yishay Mansour, Kunal Talwar

We study the problem of controlling linear time-invariant systems with known noisy dynamics and adversarially chosen quadratic losses.

Online Learning with Many Experts

no code implementations 25 Feb 2017 Alon Cohen, Shie Mannor

We study the problem of prediction with expert advice when the number of experts in question may be extremely large or even infinite.

Tight Bounds for Bandit Combinatorial Optimization

no code implementations 24 Feb 2017 Alon Cohen, Tamir Hazan, Tomer Koren

We revisit the study of optimal regret rates in bandit combinatorial optimization, a fundamental framework for sequential decision making under uncertainty that abstracts numerous combinatorial prediction problems.

Combinatorial Optimization Decision Making +1

Online Learning with Feedback Graphs Without the Graphs

no code implementations 23 May 2016 Alon Cohen, Tamir Hazan, Tomer Koren

We study an online learning framework introduced by Mannor and Shamir (2011) in which the feedback is specified by a graph, in a setting where the graph may vary from round to round and is never fully revealed to the learner.
