Multi-Armed Bandits
196 papers with code • 1 benchmark • 2 datasets
Multi-armed bandits refer to a class of sequential decision-making tasks in which a fixed amount of resources must be allocated among competing choices (arms) so as to maximize expected gain. Typically these problems involve an exploration/exploitation trade-off: the agent must balance trying arms with uncertain payoffs against repeatedly playing the arm that currently looks best.
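The exploration/exploitation trade-off can be sketched with a minimal ε-greedy agent on a toy Bernoulli bandit. The arm reward probabilities below are hypothetical, chosen only for illustration:

```python
import random

def epsilon_greedy_bandit(true_means, n_steps=5000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a Bernoulli bandit.

    true_means: hypothetical per-arm reward probabilities (illustrative only).
    Returns per-arm value estimates and pull counts.
    """
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k      # number of pulls per arm
    values = [0.0] * k    # running mean reward per arm
    for _ in range(n_steps):
        if rng.random() < epsilon:      # explore: pick a random arm
            arm = rng.randrange(k)
        else:                           # exploit: pick the best estimate so far
            arm = max(range(k), key=lambda a: values[a])
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        values[arm] += (reward - values[arm]) / counts[arm]  # incremental mean
    return values, counts

values, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8])
```

With ε = 0.1 the agent spends roughly 90% of its pulls exploiting, so the arm with the highest true mean ends up played most often while the others are still sampled occasionally.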
(Image credit: Microsoft Research)
Latest papers with no code
Contextual Restless Multi-Armed Bandits with Application to Demand Response Decision-Making
This paper introduces a novel multi-armed bandits framework, termed Contextual Restless Bandits (CRB), for complex online decision-making.
Transfer in Sequential Multi-armed Bandits via Reward Samples
We consider a sequential stochastic multi-armed bandit problem where the agent interacts with the bandit over multiple episodes.
Phasic Diversity Optimization for Population-Based Reinforcement Learning
Furthermore, we construct a dogfight scenario for aerial agents to demonstrate the practicality of the PDO algorithm.
ε-Neural Thompson Sampling of Deep Brain Stimulation for Parkinson Disease Treatment
Traditional commercial DBS devices are only able to deliver fixed-frequency periodic pulses to the basal ganglia (BG) regions of the brain, i.e., continuous DBS (cDBS).
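The Thompson sampling strategy named in the title above can be sketched for a generic Beta-Bernoulli bandit. This is an illustration of the basic technique only, not the paper's neural or clinical implementation, and the arm probabilities are made up:

```python
import random

def thompson_sampling(true_means, n_steps=5000, seed=0):
    """Beta-Bernoulli Thompson sampling on a toy bandit.

    true_means: hypothetical per-arm reward probabilities (illustrative only).
    Maintains a Beta(alpha, beta) posterior per arm and plays the arm whose
    posterior sample is largest. Returns the pull count per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    alpha = [1.0] * k  # 1 + observed successes (uniform prior)
    beta = [1.0] * k   # 1 + observed failures
    pulls = [0] * k
    for _ in range(n_steps):
        # Draw a plausible mean for each arm from its posterior, play the best.
        samples = [rng.betavariate(alpha[a], beta[a]) for a in range(k)]
        arm = max(range(k), key=lambda a: samples[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        alpha[arm] += reward
        beta[arm] += 1 - reward
        pulls[arm] += 1
    return pulls

pulls = thompson_sampling([0.3, 0.6])
```

Because posterior sampling naturally shifts pulls toward arms with higher estimated reward while uncertainty remains, the better arm accumulates most of the plays without an explicit exploration parameter.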
Efficient Public Health Intervention Planning Using Decomposition-Based Decision-Focused Learning
However, the availability and time of these health workers are limited resources.
LC-Tsallis-INF: Generalized Best-of-Both-Worlds Linear Contextual Bandits
To address this issue, this study proposes an algorithm whose regret satisfies $O(\log(T))$ when the suboptimality gap is lower-bounded.
Adaptive Learning Rate for Follow-the-Regularized-Leader: Competitive Analysis and Best-of-Both-Worlds
Follow-The-Regularized-Leader (FTRL) is known as an effective and versatile approach in online learning, where an appropriate choice of learning rate is crucial for achieving small regret.
Investigating Gender Fairness in Machine Learning-driven Personalized Care for Chronic Pain
In this article, we study gender fairness in personalized pain care recommendations using a real-world application of reinforcement learning (Piette et al., 2022a).
Federated Linear Contextual Bandits with Heterogeneous Clients
The demand for collaborative and private bandit learning across multiple agents is surging due to the growing quantity of data generated from distributed systems.
Batched Nonparametric Contextual Bandits
We study nonparametric contextual bandits under batch constraints, where the expected reward for each action is modeled as a smooth function of covariates, and the policy updates are made at the end of each batch of observations.