no code implementations • 10 Feb 2024 • Yuriy Dorn, Aleksandr Katrutsa, Ilgam Latypov, Andrey Pudovikov
In this study, we propose a new method for constructing UCB-type algorithms for stochastic multi-armed bandits based on general convex optimization methods with an inexact oracle.
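For orientation, the classic UCB1 rule is the usual reference point for "UCB-type" algorithms. The sketch below is that textbook baseline, not the construction proposed in the paper. The names `ucb1` and `make_bernoulli_bandit`, and the Bernoulli reward model, are assumptions made for illustration.

```python
import math
import random


def ucb1(pull, n_arms, horizon):
    """Textbook UCB1 (illustrative baseline, not the paper's method).

    Plays each arm once, then always picks the arm with the largest
    empirical mean plus exploration bonus sqrt(2 * ln(t) / n_i).
    Returns the number of times each arm was pulled.
    """
    counts = [0] * n_arms   # pulls per arm
    sums = [0.0] * n_arms   # cumulative reward per arm
    for t in range(1, horizon + 1):
        if t <= n_arms:
            arm = t - 1     # initial round-robin pass over all arms
        else:
            arm = max(
                range(n_arms),
                key=lambda i: sums[i] / counts[i]
                + math.sqrt(2.0 * math.log(t) / counts[i]),
            )
        reward = pull(arm)
        counts[arm] += 1
        sums[arm] += reward
    return counts


def make_bernoulli_bandit(means, seed=0):
    """Hypothetical test environment: arm i pays 1 with probability means[i]."""
    rng = random.Random(seed)

    def pull(arm):
        return 1.0 if rng.random() < means[arm] else 0.0

    return pull


if __name__ == "__main__":
    counts = ucb1(make_bernoulli_bandit([0.1, 0.9]), n_arms=2, horizon=2000)
    print(counts)  # the better arm (index 1) should receive most pulls
```

The exploration bonus shrinks as an arm is pulled more often, so under this rule suboptimal arms are sampled less and less frequently over time.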
1 code implementation • 11 May 2023 • Yuriy Dorn, Nikita Kornilov, Nikolay Kutuzov, Alexander Nazin, Eduard Gorbunov, Alexander Gasnikov
We establish convergence results under mild assumptions on the reward distribution and demonstrate that INF-clip is optimal for linear heavy-tailed stochastic MAB problems and works well for non-linear ones.
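The role of clipping under heavy-tailed feedback can be illustrated with a small sketch. This is not the INF-clip algorithm itself: it only shows a generic clipped importance-weighted estimate, assuming a bandit setting where a single arm is observed per round. The function names and the threshold parameter `lam` are hypothetical.

```python
def clip(x, lam):
    """Truncate x to the interval [-lam, lam]."""
    return max(-lam, min(lam, x))


def clipped_iw_estimate(observed, arm, probs, lam):
    """Clipped importance-weighted estimate (illustrative assumption only).

    Only the played arm gets a nonzero entry: the observation is clipped
    to [-lam, lam] and then divided by the probability of playing that arm.
    Clipping bounds the influence of a single heavy-tailed observation at
    the cost of some bias.
    """
    estimate = [0.0] * len(probs)
    estimate[arm] = clip(observed, lam) / probs[arm]
    return estimate


if __name__ == "__main__":
    # An outlier observation of 100.0 is truncated to lam = 2.0 before weighting.
    print(clipped_iw_estimate(100.0, arm=1, probs=[0.5, 0.5], lam=2.0))
```

A larger `lam` reduces the bias introduced by truncation but lets rare extreme observations dominate the estimate, so the threshold trades bias against variance.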