About

The multi-armed bandit problem is a task in which a fixed, limited amount of resources must be allocated among competing alternatives (the "arms") so as to maximize expected gain. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
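
To make the exploration/exploitation trade-off concrete, here is a minimal epsilon-greedy sketch on a simulated Bernoulli bandit (the arm probabilities and epsilon below are arbitrary illustration values, not from any of the papers listed):

```python
import random

def epsilon_greedy(true_probs, horizon=10_000, epsilon=0.1):
    """Play a Bernoulli bandit: explore with probability epsilon, else exploit."""
    n_arms = len(true_probs)
    counts = [0] * n_arms    # pulls per arm
    values = [0.0] * n_arms  # running mean reward per arm
    total = 0.0
    for _ in range(horizon):
        if random.random() < epsilon:
            arm = random.randrange(n_arms)                     # explore
        else:
            arm = max(range(n_arms), key=lambda a: values[a])  # exploit
        reward = 1.0 if random.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]    # incremental mean
        total += reward
    return total, values

total, values = epsilon_greedy([0.2, 0.5, 0.7])
print(f"average reward: {total / 10_000:.3f}, estimates: {values}")
```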


Latest papers with code

Policy Learning with Adaptively Collected Data

5 May 2021 · gsbDBI/PolicyLearning

We complement this regret upper bound with a lower bound that characterizes the fundamental difficulty of policy learning with adaptive data.

MULTI-ARMED BANDITS

★ 0 · 05 May 2021

Combinatorial Bandits under Strategic Manipulations

25 Feb 2021 · shirleydongj/StrategicCUCB

We study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulations of rewards, where each arm can modify the emitted reward signals for its own interest.

MULTI-ARMED BANDITS

★ 2 · 25 Feb 2021
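
The repository name suggests a strategic variant of CUCB; the setting is arms that misreport their rewards. As a hedged illustration of that setting (not the authors' algorithm), the sketch below runs plain UCB1 against arms that inflate their emitted rewards out of a fixed manipulation budget:

```python
import math, random

def ucb1_with_manipulation(true_means, budgets, horizon=5000):
    """UCB1 learner; arm i may add up to budgets[i] total inflation to its rewards."""
    n = len(true_means)
    counts, sums = [0] * n, [0.0] * n
    spent = [0.0] * n  # manipulation budget used so far
    for t in range(1, horizon + 1):
        if t <= n:
            arm = t - 1  # pull each arm once to initialize
        else:
            arm = max(range(n), key=lambda a: sums[a] / counts[a]
                      + math.sqrt(2 * math.log(t) / counts[a]))
        reward = random.gauss(true_means[arm], 0.1)
        boost = min(0.5, budgets[arm] - spent[arm])  # strategic inflation per pull
        spent[arm] += boost
        counts[arm] += 1
        sums[arm] += reward + boost  # the learner only sees the manipulated signal
    return counts

# The inferior arm (mean 0.3) has a large budget and attracts extra pulls.
print(ucb1_with_manipulation([0.3, 0.6], budgets=[50.0, 0.0]))
```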

Federated Multi-armed Bandits with Personalization

25 Feb 2021 · ShenGroup/PF_MAB

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization.

FEDERATED LEARNING · MULTI-ARMED BANDITS

★ 1 · 25 Feb 2021
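
The abstract suggests each client balances its own local model against a shared global one. A minimal sketch of one plausible personalized objective; the mixture weight alpha and the Gaussian reward model are assumptions for illustration, not taken from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)
n_clients, n_arms, alpha = 5, 4, 0.7  # alpha: weight on the local model (assumed)

global_means = rng.uniform(0, 1, n_arms)
# Each client's local arm means deviate from the global model.
local_means = np.clip(global_means + rng.normal(0, 0.2, (n_clients, n_arms)), 0, 1)

# Personalized objective: each client cares about a mix of local and global reward.
personalized = alpha * local_means + (1 - alpha) * global_means
print("best personalized arm per client:", personalized.argmax(axis=1))
```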

Output-Weighted Sampling for Multi-Armed Bandits with Extreme Payoffs

19 Feb 2021 · PredictiveIntelligenceLab/jax-bandits

We present a new type of acquisition function for online decision making in multi-armed and contextual bandit problems with extreme payoffs.

DECISION MAKING · GAUSSIAN PROCESSES · MULTI-ARMED BANDITS

★ 0 · 19 Feb 2021
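
"Output-weighted" suggests biasing the acquisition toward inputs whose predicted payoff is extreme. The sketch below is only a guess at that flavor, not the paper's actual criterion: a UCB score reweighted by the magnitude of a toy per-arm posterior mean (all values are made up):

```python
import numpy as np

rng = np.random.default_rng(1)
mu = rng.normal(0, 1, 10)        # posterior mean payoff per arm (toy values)
sigma = rng.uniform(0.1, 1, 10)  # posterior standard deviation per arm

beta = 2.0
ucb = mu + beta * sigma                  # classic optimism-based score
weights = np.abs(mu) / np.abs(mu).sum()  # emphasize extreme predicted payoffs
output_weighted = weights * ucb          # hypothetical output-weighted score

print("UCB pick:", ucb.argmax(), "| output-weighted pick:", output_weighted.argmax())
```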

Federated Multi-Armed Bandits

28 Jan 2021 · ShenGroup/FMAB

We first study the approximate model where the heterogeneous local models are random realizations of the global model from an unknown distribution.

FEDERATED LEARNING · MULTI-ARMED BANDITS · RECOMMENDATION SYSTEMS

★ 0 · 28 Jan 2021
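
The quoted sentence describes local models as random realizations of a global model from an unknown distribution. A direct simulation of that generative assumption (the Gaussian perturbation and its scale are illustrative choices):

```python
import numpy as np

rng = np.random.default_rng(2)
n_clients, n_arms = 20, 5

global_means = rng.uniform(0, 1, n_arms)
# Local models are i.i.d. realizations of the global model; here we assume
# Gaussian perturbations purely for illustration.
local_means = global_means + rng.normal(0, 0.1, (n_clients, n_arms))

# Averaging local models recovers the global model as the number of clients grows.
print("global best arm:", global_means.argmax())
print("best arm of averaged local models:", local_means.mean(axis=0).argmax())
```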

An empirical evaluation of active inference in multi-armed bandits

21 Jan 2021 · dimarkov/aibandits

This comparison is done on two types of bandit problems: a stationary and a dynamic switching bandit.

DECISION MAKING · DECISION MAKING UNDER UNCERTAINTY · MULTI-ARMED BANDITS

★ 4 · 21 Jan 2021
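
The comparison uses a stationary and a dynamic switching bandit. A minimal switching-bandit environment, where all arm means are re-drawn at random switch times (the switch rate below is an arbitrary assumption):

```python
import random

def switching_bandit(n_arms=4, horizon=1000, switch_prob=0.01, seed=0):
    """Yield (step, true_probs); arm means are re-drawn at random switch times."""
    rng = random.Random(seed)
    probs = [rng.random() for _ in range(n_arms)]
    for t in range(horizon):
        if rng.random() < switch_prob:  # regime switch: best arm may change
            probs = [rng.random() for _ in range(n_arms)]
        yield t, list(probs)

for t, probs in switching_bandit(horizon=5):
    print(t, [round(p, 2) for p in probs])
```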

Relational Boosted Bandits

16 Dec 2020 · ashutoshaay26/Relational-Boosted-Bandits

Contextual bandit algorithms have become essential in real-world user interaction problems in recent years.

LINK PREDICTION · MULTI-ARMED BANDITS

★ 1 · 16 Dec 2020

Active Feature Selection for the Mutual Information Criterion

13 Dec 2020 · ShacharSchnapp/ActiveFeatureSelection

We study active feature selection, a novel feature selection setting in which unlabeled data is available, but the budget for labels is limited, and the examples to label can be actively selected by the algorithm.

FEATURE SELECTION · MULTI-ARMED BANDITS

★ 0 · 13 Dec 2020
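
The mutual information criterion itself is easy to demonstrate: rank features by estimated MI with the label, computed on whatever subset of examples the label budget covers. The sketch below uses scikit-learn's estimator on a randomly (not actively) chosen subset, so it shows only the criterion, not the authors' selection algorithm:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import mutual_info_classif

X, y = make_classification(n_samples=2000, n_features=20,
                           n_informative=5, random_state=0)

budget = 200  # number of labels we can afford to acquire
rng = np.random.default_rng(0)
labeled = rng.choice(len(X), size=budget, replace=False)

# Estimate MI between each feature and the label from the labeled subset only.
mi = mutual_info_classif(X[labeled], y[labeled], random_state=0)
print("top-5 features by estimated MI:", np.argsort(mi)[-5:][::-1])
```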

BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

NeurIPS 2020 · ThrunGroup/BanditPAM

In these experiments, we observe that BanditPAM returns the same results as state-of-the-art PAM-like algorithms up to 4x faster while performing up to 200x fewer distance computations.

MULTI-ARMED BANDITS

★ 16 · 01 Dec 2020
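
The core trick in BanditPAM is to treat each candidate medoid as an arm and estimate its mean distance to the data from samples, so the best candidate can be found without computing all pairwise distances. The sketch below captures only that sampling idea with uniform (non-adaptive) sampling; the actual algorithm samples adaptively with confidence intervals, and this is not the authors' implementation:

```python
import numpy as np

def sampled_medoid(X, batch=32, rounds=50, seed=0):
    """Estimate each point's mean distance to the data from random samples and
    return the point with the lowest estimate (a cheap stand-in for the exact
    medoid, which would need all O(n^2) pairwise distances)."""
    rng = np.random.default_rng(seed)
    n = len(X)
    sums = np.zeros(n)
    for _ in range(rounds):
        ref = X[rng.choice(n, size=batch)]  # sampled reference points
        # distance from every candidate to the sampled references
        d = np.linalg.norm(X[:, None, :] - ref[None, :, :], axis=2)
        sums += d.mean(axis=1)
    return int(np.argmin(sums))

X = np.random.default_rng(1).normal(size=(500, 2))
print("estimated medoid index:", sampled_medoid(X))
```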

Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

NeurIPS 2020 · khashayarkhv/many-armed-bandit

We study the structure of regret-minimizing policies in the many-armed Bayesian multi-armed bandit problem: in particular, with $k$ the number of arms and $T$ the time horizon, we consider the case where $k \geq \sqrt{T}$.

MULTI-ARMED BANDITS

★ 2 · 01 Dec 2020
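
When $k \geq \sqrt{T}$ there are too many arms to explore them all, and one policy studied in this regime is to subsample a small set of arms and then act purely greedily on it. A minimal sketch; the subsample size $\lceil\sqrt{T}\rceil$ and the Gaussian reward model are natural illustrative choices, not necessarily the paper's exact tuning:

```python
import math, random

def subsampled_greedy(true_means, horizon, seed=0):
    """Restrict play to ~sqrt(T) randomly chosen arms, then be purely greedy."""
    rng = random.Random(seed)
    m = min(math.ceil(math.sqrt(horizon)), len(true_means))
    arms = rng.sample(range(len(true_means)), k=m)
    # one initial pull per subsampled arm
    values = {a: rng.gauss(true_means[a], 0.1) for a in arms}
    counts = {a: 1 for a in arms}
    total = sum(values.values())
    for _ in range(horizon - m):
        a = max(arms, key=values.get)  # always exploit the empirical best
        r = rng.gauss(true_means[a], 0.1)
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]  # incremental mean update
        total += r
    return total / horizon

true_means = [random.random() for _ in range(10_000)]  # many arms: k >= sqrt(T)
print("average reward:", round(subsampled_greedy(true_means, horizon=100_000), 3))
```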