Multi-armed bandits refer to the task of allocating a fixed amount of resources among competing choices in a way that maximizes expected gain. These problems typically involve an exploration/exploitation trade-off.

(Image credit: Microsoft Research)
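To make the exploration/exploitation trade-off concrete, here is a minimal sketch of an epsilon-greedy agent on Bernoulli arms (the function name and parameters are illustrative, not taken from any paper listed below):

```python
import random

def epsilon_greedy(true_probs, steps=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on Bernoulli arms.

    Returns per-arm empirical mean rewards and pull counts.
    """
    rng = random.Random(seed)
    k = len(true_probs)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # running empirical mean reward per arm
    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(k)  # explore: pick a uniformly random arm
        else:
            arm = max(range(k), key=lambda a: values[a])  # exploit: best estimate so far
        reward = 1.0 if rng.random() < true_probs[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean update
    return values, counts

values, counts = epsilon_greedy([0.2, 0.5, 0.8])
```

With a small exploration rate, the agent spends most of its pulls on the arm with the highest estimated mean while still sampling the others often enough to correct bad early estimates.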


# Policy Learning with Adaptively Collected Data

5 May 2021 · gsbDBI/PolicyLearning

We complement this regret upper bound with a lower bound that characterizes the fundamental difficulty of policy learning with adaptive data.


# Combinatorial Bandits under Strategic Manipulations

25 Feb 2021 · shirleydongj/StrategicCUCB

We study the problem of combinatorial multi-armed bandits (CMAB) under strategic manipulations of rewards, where each arm can modify the emitted reward signals for its own interest.


# Federated Multi-armed Bandits with Personalization

25 Feb 2021 · ShenGroup/PF_MAB

A general framework of personalized federated multi-armed bandits (PF-MAB) is proposed, which is a new bandit paradigm analogous to the federated learning (FL) framework in supervised learning and enjoys the features of FL with personalization.


# Output-Weighted Sampling for Multi-Armed Bandits with Extreme Payoffs

19 Feb 2021 · PredictiveIntelligenceLab/jax-bandits

We present a new type of acquisition functions for online decision making in multi-armed and contextual bandit problems with extreme payoffs.


# Federated Multi-Armed Bandits

28 Jan 2021 · ShenGroup/FMAB

We first study the approximate model where the heterogeneous local models are random realizations of the global model from an unknown distribution.


# An empirical evaluation of active inference in multi-armed bandits

21 Jan 2021 · dimarkov/aibandits

This comparison is done on two types of bandit problems: a stationary and a dynamic switching bandit.

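For context, a "dynamic switching bandit" can be sketched as a Bernoulli bandit whose arm means are permuted at fixed intervals; this is an illustrative construction, and the paper's exact switching schedule may differ:

```python
import random

class SwitchingBandit:
    """Bernoulli bandit whose arm means are randomly permuted every `switch_every` pulls."""

    def __init__(self, probs, switch_every=100, seed=0):
        self.rng = random.Random(seed)
        self.probs = list(probs)
        self.switch_every = switch_every
        self.t = 0  # total number of pulls so far

    def pull(self, arm):
        if self.t > 0 and self.t % self.switch_every == 0:
            self.rng.shuffle(self.probs)  # regime switch: permute which arm is best
        self.t += 1
        return 1.0 if self.rng.random() < self.probs[arm] else 0.0

bandit = SwitchingBandit([0.1, 0.9], switch_every=50)
rewards = [bandit.pull(0) for _ in range(200)]  # arm 0 alternates between good and bad regimes
```

Unlike a stationary bandit, a policy here must keep exploring, since an arm that was optimal before a switch may no longer be.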

# Relational Boosted Bandits

16 Dec 2020 · ashutoshaay26/Relational-Boosted-Bandits

Contextual bandits algorithms have become essential in real-world user interaction problems in recent years.


# Active Feature Selection for the Mutual Information Criterion

13 Dec 2020 · ShacharSchnapp/ActiveFeatureSelection

We study active feature selection, a novel feature selection setting in which unlabeled data is available, but the budget for labels is limited, and the examples to label can be actively selected by the algorithm.


# BanditPAM: Almost Linear Time k-Medoids Clustering via Multi-Armed Bandits

In these experiments, we observe that BanditPAM returns the same results as state-of-the-art PAM-like algorithms up to 4x faster while performing up to 200x fewer distance computations.

01 Dec 2020

# Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms

We study the structure of regret-minimizing policies in the {\em many-armed} Bayesian multi-armed bandit problem: in particular, with $k$ the number of arms and $T$ the time horizon, we consider the case where $k \geq \sqrt{T}$.

01 Dec 2020
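A sketch of one greedy strategy for the many-armed regime ($k \geq \sqrt{T}$): subsample roughly $\sqrt{T}$ arms and act greedily among them. The subsample size, uniform prior on arm means, and function name here are illustrative assumptions, not the paper's exact construction:

```python
import random

def subsampled_greedy(k=1000, horizon=900, m=None, seed=1):
    """Greedy play over a random subsample of m ~ sqrt(T) arms out of k Bernoulli arms.

    Returns the average per-step reward over the horizon.
    """
    rng = random.Random(seed)
    true_probs = [rng.random() for _ in range(k)]  # arm means drawn from Uniform(0, 1)
    m = m or int(horizon ** 0.5)                   # subsample ~ sqrt(T) arms
    arms = rng.sample(range(k), m)
    counts = {a: 0 for a in arms}
    values = {a: 0.0 for a in arms}
    total = 0.0
    for t in range(horizon):
        if t < m:
            a = arms[t]  # initialization: pull each subsampled arm once
        else:
            a = max(arms, key=lambda x: values[x])  # then purely greedy, no exploration
        r = 1.0 if rng.random() < true_probs[a] else 0.0
        counts[a] += 1
        values[a] += (r - values[a]) / counts[a]
        total += r
    return total / horizon
```

The intuition is that with many arms, a modest random subsample already contains a near-optimal arm with high probability, so explicit exploration across all $k$ arms is unnecessary.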