no code implementations • 29 Sep 2021 • Romain Laroche, Othmane Safsafi, Raphael Feraud, Nicolas Broutin
In Batched Multi-Armed Bandits (BMAB), the policy is not allowed to be updated at each time step.
no code implementations • 10 May 2017 • Djallel Bouneffouf, Irina Rish, Guillermo A. Cecchi, Raphael Feraud
We consider a novel formulation of the multi-armed bandit model, which we call the contextual bandit with restricted context, where only a limited number of features can be accessed by the learner at every iteration.
no code implementations • ICLR 2018 • Romain Laroche, Raphael Feraud
This paper formalises the problem of online algorithm selection in the context of Reinforcement Learning.
no code implementations • 29 Sep 2014 • Robin Allesiardo, Raphael Feraud, Djallel Bouneffouf
This paper presents a new contextual bandit algorithm, NeuralBandit, which does not need any hypothesis on the stationarity of contexts and rewards.