no code implementations • 15 Feb 2022 • Sarah Sachs, Hédi Hadiji, Tim van Erven, Cristóbal Guzmán
case, our bounds match the rates one would expect from results in stochastic acceleration, and in the fully adversarial case they gracefully deteriorate to match the minimax regret.
no code implementations • 11 Feb 2022 • Jack J. Mayo, Hédi Hadiji, Tim van Erven
We follow up on this observation by showing that there is in fact never a price to pay for adaptivity if we specialise to any of the other common supervised online learning losses: our results cover log loss, (linear and non-parametric) logistic regression, square loss prediction, and (linear and non-parametric) least-squares regression.
no code implementations • 15 Feb 2021 • Dirk van der Hoeven, Hédi Hadiji, Tim van Erven
Each round, an adversary first activates one of the agents to issue a prediction and provides a corresponding gradient, and then the agents are allowed to send a $b$-bit message to their neighbors in the graph.
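To make the $b$-bit communication constraint concrete, here is a minimal sketch of one standard way a real-valued quantity (e.g. a gradient coordinate) can be compressed into a $b$-bit message: uniform quantization over a known bounded range. This is purely illustrative; the paper's actual protocol and message encoding are not specified here, and the range `[lo, hi]` is an assumption.

```python
def quantize(x: float, b: int, lo: float = -1.0, hi: float = 1.0) -> int:
    """Encode x in [lo, hi] as one of 2^b integer levels (a b-bit message)."""
    levels = (1 << b) - 1          # number of quantization steps
    x = min(max(x, lo), hi)        # clip to the assumed range
    return round((x - lo) / (hi - lo) * levels)

def dequantize(msg: int, b: int, lo: float = -1.0, hi: float = 1.0) -> float:
    """Decode a b-bit message back to a representative value in [lo, hi]."""
    levels = (1 << b) - 1
    return lo + msg / levels * (hi - lo)
```

With $b$ bits the reconstruction error is at most half a quantization step, i.e. $(hi - lo)/(2(2^b - 1))$, which is the kind of trade-off between message size and precision that communication-constrained online learning has to manage.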
no code implementations • 5 Oct 2020 • Hédi Hadiji, Sébastien Gerchinovitz, Jean-Michel Loubes, Gilles Stoltz
We consider the bandit-based framework for diversity-preserving recommendations introduced by Celis et al. (2019), who approached it in the case of a polytope mainly by a reduction to the setting of linear bandits.
no code implementations • 5 Jun 2020 • Hédi Hadiji, Gilles Stoltz
We consider stochastic bandit problems with $K$ arms, each associated with a bounded distribution supported on the range $[m, M]$.
no code implementations • 24 May 2019 • Hédi Hadiji
In the context of stochastic continuum-armed bandits, we present an algorithm that adapts to the unknown smoothness of the objective function.
1 code implementation • 14 May 2018 • Aurélien Garivier, Hédi Hadiji, Pierre Menard, Gilles Stoltz
We obtain this non-parametric bi-optimality result while streamlining the proofs, both of previously known regret bounds and of the new analyses carried out here; a second merit of the present contribution is therefore to provide a review of proofs of classical regret bounds for index-based strategies for $K$-armed stochastic bandits.