OSOM: A simultaneously optimal algorithm for multi-armed and linear contextual bandits

24 May 2019 · Niladri S. Chatterji, Vidya Muthukumar, Peter L. Bartlett

We consider the stochastic linear (multi-armed) contextual bandit problem with the possibility of hidden simple multi-armed bandit structure, in which the rewards are independent of the contextual information. Algorithms that are designed solely for one of the regimes are known to be sub-optimal in the alternate regime...
