Multi-Armed Bandits

11 papers with code · Miscellaneous

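Multi-armed bandits refer to a task in which a fixed amount of resources must be allocated among competing choices (the "arms") in a way that maximizes expected gain. Because the payoff of each choice is only partially known at the start and is learned by trying it, these problems typically involve an exploration/exploitation trade-off.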
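
The exploration/exploitation trade-off can be illustrated with a minimal sketch. It assumes Bernoulli (0/1) rewards and uses the epsilon-greedy rule, a common textbook baseline that is not taken from any paper listed on this page. The function name `epsilon_greedy`, the arm means, and the default parameters are all illustrative assumptions.

```python
import random

def epsilon_greedy(true_means, n_rounds=5000, epsilon=0.1, seed=0):
    """Run epsilon-greedy on simulated Bernoulli arms.

    true_means is a hypothetical list of per-arm success probabilities;
    the learner never reads it directly, it only sees sampled rewards.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    counts = [0] * n_arms        # number of pulls per arm
    estimates = [0.0] * n_arms   # running mean reward per arm
    total_reward = 0
    for _ in range(n_rounds):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: pick a random arm
        else:
            # exploit: pick the arm with the best current estimate
            arm = max(range(n_arms), key=lambda a: estimates[a])
        reward = 1 if rng.random() < true_means[arm] else 0
        counts[arm] += 1
        # incremental update of the running mean for the pulled arm
        estimates[arm] += (reward - estimates[arm]) / counts[arm]
        total_reward += reward
    return counts, total_reward

counts, total_reward = epsilon_greedy([0.2, 0.5, 0.8])
print(counts, total_reward)
```

With a small `epsilon`, most pulls go to whichever arm currently looks best, while the occasional random pull keeps the estimates of the other arms from going stale.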

Greatest papers with code

Deep Bayesian Bandits Showdown: An Empirical Comparison of Bayesian Deep Networks for Thompson Sampling

ICLR 2018 · tensorflow/models

Advances in approximate Bayesian methods have made posterior approximation for flexible neural network models practical.

DECISION MAKING · MULTI-ARMED BANDITS

Taming the Monster: A Fast and Simple Algorithm for Contextual Bandits

4 Feb 2014 · VowpalWabbit/vowpal_wabbit

We present a new algorithm for the contextual bandit learning problem, where the learner repeatedly takes one of $K$ actions in response to the observed context, and observes the reward only for that chosen action.

MULTI-ARMED BANDITS
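
The contextual bandit protocol described in the entry above (observe a context, take one of $K$ actions, see the reward for that action only) can be sketched as a simulation loop. This is not the algorithm from the paper and does not use the vowpal_wabbit API. The learner is a plain per-context epsilon-greedy table, and the two contexts, the `reward_prob` table, and the function name `run_contextual_bandit` are all hypothetical.

```python
import random

def run_contextual_bandit(n_rounds=2000, epsilon=0.1, seed=0):
    """Simulate the contextual bandit feedback loop with K = 3 actions."""
    rng = random.Random(seed)
    n_actions = 3
    # Hypothetical environment: success probability per (context, action).
    reward_prob = {0: [0.9, 0.1, 0.1], 1: [0.1, 0.1, 0.9]}
    counts = {c: [0] * n_actions for c in reward_prob}
    means = {c: [0.0] * n_actions for c in reward_prob}
    log = []
    for _ in range(n_rounds):
        x = rng.choice(list(reward_prob))      # 1. observe a context
        if rng.random() < epsilon:             # 2. choose one of K actions
            a = rng.randrange(n_actions)
        else:
            a = max(range(n_actions), key=lambda i: means[x][i])
        # 3. only the chosen action's reward is revealed; the rewards of
        #    the other actions in this round are never observed
        r = 1 if rng.random() < reward_prob[x][a] else 0
        counts[x][a] += 1
        means[x][a] += (r - means[x][a]) / counts[x][a]
        log.append((x, a, r))
    return log, means

log, means = run_contextual_bandit()
print(len(log), means)
```

Each logged tuple holds a single reward, which is what separates this setting from supervised learning, where the correct label for every input is available.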

Adapting multi-armed bandits policies to contextual bandits scenarios

11 Nov 2018 · david-cortes/contextualbandits

This work explores adaptations of successful multi-armed bandits policies to the online contextual bandits scenario with binary rewards using binary classification algorithms such as logistic regression as black-box oracles.

MULTI-ARMED BANDITS

Learning Structural Weight Uncertainty for Sequential Decision-Making

30 Dec 2017 · zhangry868/S2VGD

Learning probability distributions on the weights of neural networks (NNs) has recently proven beneficial in many applications.

DECISION MAKING · MULTI-ARMED BANDITS

A Survey on Contextual Multi-armed Bandits

13 Aug 2015 · yanyangbaobeiIsEmma/Reinforcement-Learning-Contextual-Bandits

In this survey we cover a few stochastic and adversarial contextual bandit algorithms.

MULTI-ARMED BANDITS

Thompson Sampling for Contextual Bandits with Linear Payoffs

15 Sep 2012 · yanyangbaobeiIsEmma/Reinforcement-Learning-Contextual-Bandits

Thompson Sampling is one of the oldest heuristics for multi-armed bandit problems.

MULTI-ARMED BANDITS
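
As a rough illustration of the Thompson Sampling heuristic named in the entry above, the sketch below uses the simplest textbook variant: Bernoulli rewards with an independent Beta(1, 1) prior per arm. This is an assumption made for brevity and is not the linear-payoff contextual algorithm that the paper analyzes. The function name `thompson_sampling` and the arm means are illustrative.

```python
import random

def thompson_sampling(true_means, n_rounds=5000, seed=0):
    """Beta-Bernoulli Thompson Sampling on simulated arms.

    true_means is a hypothetical list of per-arm success probabilities.
    Returns (pull counts per arm, total reward collected).
    """
    rng = random.Random(seed)
    n_arms = len(true_means)
    successes = [0] * n_arms
    failures = [0] * n_arms
    for _ in range(n_rounds):
        # Draw one sample from each arm's Beta posterior
        # (Beta(1, 1) prior plus the observed successes and failures).
        samples = [
            rng.betavariate(successes[a] + 1, failures[a] + 1)
            for a in range(n_arms)
        ]
        # Play the arm whose sampled mean is largest.
        arm = max(range(n_arms), key=lambda a: samples[a])
        if rng.random() < true_means[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    pulls = [s + f for s, f in zip(successes, failures)]
    return pulls, sum(successes)

pulls, total_reward = thompson_sampling([0.2, 0.5, 0.8])
print(pulls, total_reward)
```

Exploration comes from the posterior itself: arms with few observations have wide posteriors and are occasionally sampled high, so no separate exploration rate is needed.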

The Assistive Multi-Armed Bandit

24 Jan 2019 · chanlaw/assistive-bandits

Learning preferences implicit in the choices humans make is a well studied problem in both economics and computer science.

MULTI-ARMED BANDITS

On-line Adaptative Curriculum Learning for GANs

31 Jul 2018 · Byte7/Adaptive-Curriculum-GAN-keras

We argue that less expressive discriminators are smoother and have a general coarse-grained view of the modes map, which forces the generator to cover a wide portion of the data distribution support.

MULTI-ARMED BANDITS · STOCHASTIC OPTIMIZATION

Heteroscedastic Bandits with Reneging

29 Oct 2018 · Xi-Liu/heteroscedasticbandits

Although shown to be useful in many areas as models for solving sequential decision problems with side observations (contexts), contextual bandits are subject to two major limitations.

MULTI-ARMED BANDITS

Contextual Bandits with Stochastic Experts

23 Feb 2018 · rajatsen91/CB_StochasticExperts

We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem.

MULTI-ARMED BANDITS