Search Results for author: Drishti Wali

Feedback graph regret bounds for Thompson Sampling and UCB

We study the stochastic multi-armed bandit problem with the graph-based feedback structure introduced by Mannor and Shamir.
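To make the feedback structure concrete, the following is a minimal sketch of a UCB1-style index policy with side observations. It is not the algorithm or analysis from the paper. The function name `ucb_with_feedback_graph`, the Bernoulli reward model, the `neighbors` adjacency lists, and the exploration bonus are all illustrative assumptions. The one idea it shows is that pulling an arm also reveals the rewards of that arm's neighbors in the graph, so observation counts can grow faster than pull counts.

```python
import math
import random


def ucb_with_feedback_graph(means, neighbors, horizon, seed=0):
    """UCB1-style policy where pulling arm i also reveals rewards of neighbors[i].

    means:     hypothetical Bernoulli reward means, one per arm
    neighbors: neighbors[i] lists the arms observed as a side effect of pulling i
    Returns (pulls, counts): pulls per arm and observations per arm.
    """
    rng = random.Random(seed)
    k = len(means)
    counts = [0] * k   # observations of each arm (own pulls plus side observations)
    sums = [0.0] * k   # total observed reward of each arm
    pulls = [0] * k    # times each arm was actually played

    for t in range(1, horizon + 1):
        unobserved = [i for i in range(k) if counts[i] == 0]
        if unobserved:
            # Play any arm that has never been observed so every index is defined.
            arm = unobserved[0]
        else:
            # The index uses observation counts, not pull counts (an assumed design).
            arm = max(
                range(k),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        pulls[arm] += 1

        # Graph feedback: observe the played arm and all of its neighbors.
        for j in {arm} | set(neighbors[arm]):
            reward = 1.0 if rng.random() < means[j] else 0.0
            counts[j] += 1
            sums[j] += reward

    return pulls, counts
```

With a complete graph every round observes every arm, which corresponds to full-information feedback. With an empty graph the sketch reduces to the standard bandit setting, where each arm is observed only when it is pulled.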

