Search Results for author: Tal Lancewicki

Found 7 papers, 0 papers with code

Regret Minimization and Convergence to Equilibria in General-sum Markov Games

no code implementations | 28 Jul 2022 | Liad Erez, Tal Lancewicki, Uri Sherman, Tomer Koren, Yishay Mansour

Our key observation is that online learning via policy optimization in Markov games essentially reduces to a form of weighted regret minimization, with unknown weights determined by the path length of the agents' policy sequence.

Cooperative Online Learning in Stochastic and Adversarial MDPs

no code implementations | 31 Jan 2022 | Tal Lancewicki, Aviv Rosenberg, Yishay Mansour

We study cooperative online learning in stochastic and adversarial Markov decision processes (MDPs).

Reinforcement Learning (RL)

Stochastic Multi-Armed Bandits with Unrestricted Delay Distributions

no code implementations | 4 Jun 2021 | Tal Lancewicki, Shahar Segal, Tomer Koren, Yishay Mansour

We study the stochastic Multi-Armed Bandit (MAB) problem with random delays in the feedback received by the algorithm.

Multi-Armed Bandits

Learning Adversarial Markov Decision Processes with Delayed Feedback

no code implementations | 29 Dec 2020 | Tal Lancewicki, Aviv Rosenberg, Yishay Mansour

We present novel algorithms based on policy optimization that achieve near-optimal high-probability regret of $\widetilde O ( \sqrt{K} + \sqrt{D} )$ under full-information feedback, where $K$ is the number of episodes and $D = \sum_{k} d^k$ is the total delay.

Recommendation Systems
