Search Results for author: Nikolai Karpov

Found 7 papers, 0 papers with code

Communication-Efficient Collaborative Regret Minimization in Multi-Armed Bandits

no code implementations • 26 Jan 2023 • Nikolai Karpov, Qin Zhang

In this paper, we study the collaborative learning model, which concerns the tradeoff between parallelism and communication overhead in multi-agent multi-armed bandits.

Multi-agent Reinforcement Learning, Multi-Armed Bandits, +2

Communication-Efficient Collaborative Best Arm Identification

no code implementations • 18 Aug 2022 • Nikolai Karpov, Qin Zhang

We investigate top-$m$ arm identification, a basic problem in bandit theory, in a multi-agent learning model in which agents collaborate to learn an objective function.

Parallel Best Arm Identification in Heterogeneous Environments

no code implementations • 16 Jul 2022 • Nikolai Karpov, Qin Zhang

In this paper, we study the tradeoffs between the time and the number of communication rounds of the best arm identification problem in the heterogeneous collaborative learning model, where multiple agents interact with possibly different environments and want to learn, in parallel, an objective function in the aggregated environment.

Multi-Armed Bandits

Batched Thompson Sampling for Multi-Armed Bandits

no code implementations • 15 Aug 2021 • Nikolai Karpov, Qin Zhang

We study Thompson Sampling algorithms for stochastic multi-armed bandits in the batched setting, in which we want to minimize the regret over a sequence of arm pulls using a small number of policy changes (or batches).

Multi-Armed Bandits, Thompson Sampling

Instance-Sensitive Algorithms for Pure Exploration in Multinomial Logit Bandit

no code implementations • 2 Dec 2020 • Nikolai Karpov, Qin Zhang

Motivated by real-world applications such as fast fashion retailing and online advertising, the Multinomial Logit Bandit (MNL-bandit) is a popular model in online learning and operations research, and has attracted much attention in the past decade.

Batched Coarse Ranking in Multi-Armed Bandits

no code implementations • NeurIPS 2020 • Nikolai Karpov, Qin Zhang

We study the problem of coarse ranking in the multi-armed bandits (MAB) setting, where we have a set of arms each of which is associated with an unknown distribution.

Multi-Armed Bandits

Collaborative Top Distribution Identifications with Limited Interaction

no code implementations • 20 Apr 2020 • Nikolai Karpov, Qin Zhang, Yuan Zhou

We give optimal time-round tradeoffs, as well as demonstrate complexity separations between top-$1$ arm identification and top-$m$ arm identifications for general $m$ and between fixed-time and fixed-confidence variants.
