no code implementations • 26 Jan 2023 • Nikolai Karpov, Qin Zhang
In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent multi-armed bandits.
no code implementations • 18 Aug 2022 • Nikolai Karpov, Qin Zhang
We investigate top-$m$ arm identification, a basic problem in bandit theory, in a multi-agent learning model in which agents collaborate to learn an objective function.
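To make the objective concrete (this is a toy single-agent, fixed-budget sketch, not the paper's collaborative algorithm; the function name and parameters are illustrative), one can pull every Bernoulli arm equally and return the $m$ arms with the highest empirical means:

```python
import random

def top_m_empirical(arm_means, m, pulls_per_arm=2000, seed=0):
    """Toy fixed-budget sketch: pull each Bernoulli arm the same
    number of times, then return the m arms with the highest
    empirical means (indices sorted for readability)."""
    rng = random.Random(seed)
    estimates = []
    for i, p in enumerate(arm_means):
        wins = sum(rng.random() < p for _ in range(pulls_per_arm))
        estimates.append((wins / pulls_per_arm, i))
    estimates.sort(reverse=True)
    return sorted(i for _, i in estimates[:m])

# With well-separated means the top-2 arms are recovered.
print(top_m_empirical([0.9, 0.8, 0.3, 0.2], m=2))  # → [0, 1]
```

In the collaborative model, the interesting question is how much the sample complexity of this task drops when the pulls are split across communicating agents.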
no code implementations • 16 Jul 2022 • Nikolai Karpov, Qin Zhang
In this paper, we study the tradeoffs between the running time and the number of communication rounds of the best arm identification problem in the heterogeneous collaborative learning model, where multiple agents interact with possibly different environments and want to learn, in parallel, an objective function in the aggregated environment.
no code implementations • 15 Aug 2021 • Nikolai Karpov, Qin Zhang
We study Thompson Sampling algorithms for stochastic multi-armed bandits in the batched setting, in which we want to minimize the regret over a sequence of arm pulls using a small number of policy changes (or batches).
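A minimal sketch of the batched idea (illustrative only, not the paper's algorithm): for Bernoulli arms with Beta posteriors, the sampling policy is frozen within each batch and the posteriors are updated only at batch boundaries, so the number of policy changes equals the number of batches:

```python
import random

def batched_thompson(arm_means, horizon=2000, num_batches=5, seed=1):
    """Sketch of batched Thompson Sampling for Bernoulli bandits.
    Posteriors are Beta(successes, failures), frozen for the
    duration of each batch; returns the total reward collected."""
    rng = random.Random(seed)
    k = len(arm_means)
    succ = [1] * k  # Beta(1, 1) priors
    fail = [1] * k
    batch_size = horizon // num_batches
    total = 0
    for _ in range(num_batches):
        frozen = list(zip(succ, fail))  # policy fixed for this batch
        new_s, new_f = [0] * k, [0] * k
        for _ in range(batch_size):
            # Sample an index from each frozen posterior; play the argmax.
            samples = [rng.betavariate(s, f) for s, f in frozen]
            arm = samples.index(max(samples))
            reward = rng.random() < arm_means[arm]
            total += reward
            if reward:
                new_s[arm] += 1
            else:
                new_f[arm] += 1
        # One posterior update per batch = one policy change.
        for i in range(k):
            succ[i] += new_s[i]
            fail[i] += new_f[i]
    return total
```

The fully sequential version corresponds to `num_batches == horizon`; the question studied is how few batches suffice to match its regret.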
no code implementations • 2 Dec 2020 • Nikolai Karpov, Qin Zhang
Motivated by real-world applications such as fast fashion retailing and online advertising, the Multinomial Logit Bandit (MNL-bandit) is a popular model in online learning and operations research, and has attracted much attention in the past decade.
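The choice model underlying the MNL-bandit can be stated as a short formula (this is the standard multinomial logit choice probability, not code from the paper; the function name is illustrative): when an assortment $S$ is offered, item $i \in S$ is chosen with probability $v_i / (1 + \sum_{j \in S} v_j)$, with the $1$ accounting for the no-purchase option.

```python
def mnl_choice_probs(assortment, v):
    """Standard MNL choice probabilities for an offered assortment.
    v maps item index -> preference weight; the no-purchase option
    has implicit weight 1."""
    denom = 1.0 + sum(v[i] for i in assortment)
    probs = {i: v[i] / denom for i in assortment}
    probs["no_purchase"] = 1.0 / denom
    return probs
```

In the bandit problem, the weights `v` are unknown and must be learned from observed choices while maximizing expected revenue.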
no code implementations • NeurIPS 2020 • Nikolai Karpov, Qin Zhang
We study the problem of coarse ranking in the multi-armed bandits (MAB) setting, where we have a set of arms, each of which is associated with an unknown distribution.
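The output format of coarse ranking can be illustrated with a toy sketch (the function name and inputs are hypothetical, and real algorithms must work from samples rather than known means): arms are partitioned into ordered clusters of prescribed sizes, so that every arm in an earlier cluster has a larger mean than every arm in a later one, without ordering arms within a cluster.

```python
def coarse_rank(empirical_means, cluster_sizes):
    """Toy illustration of a coarse ranking: partition arm indices
    into ordered clusters by (here, given) empirical mean.
    cluster_sizes like [2, 3] splits 5 arms into a top-2 cluster
    followed by a bottom-3 cluster."""
    order = sorted(range(len(empirical_means)),
                   key=lambda i: empirical_means[i], reverse=True)
    clusters, start = [], 0
    for size in cluster_sizes:
        clusters.append(sorted(order[start:start + size]))
        start += size
    return clusters

print(coarse_rank([0.9, 0.1, 0.5, 0.8, 0.2], [2, 3]))  # → [[0, 3], [1, 2, 4]]
```

The sample-complexity question is how many pulls are needed before such a partition is correct with high confidence.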
no code implementations • 20 Apr 2020 • Nikolai Karpov, Qin Zhang, Yuan Zhou
We give optimal time-round tradeoffs, and demonstrate complexity separations between top-$1$ arm identification and top-$m$ arm identification for general $m$, as well as between the fixed-time and fixed-confidence variants.