Deep Bayesian Bandits: Exploring in Online Personalized Recommendations

no code implementations · 3 Aug 2020 · Dalin Guo, Sofia Ira Ktena, Ferenc Huszar, Pranay Kumar Myana, Wenzhe Shi, Alykhan Tejani

Recommender systems trained in a continuous learning fashion are plagued by the feedback loop problem, also known as algorithmic bias.

Recommendation Systems

Why so gloomy? A Bayesian explanation of human pessimism bias in the multi-armed bandit task

no code implementations · NeurIPS 2018 · Dalin Guo, Angela J. Yu

We find that the "pessimism bias" in the bandit task is well captured by the prior mean of the Dynamic Belief Model (DBM) when fitted to human choices, but it is poorly captured by the prior mean of the Fixed Belief Model (FBM), an alternative Bayesian model that (correctly) assumes reward rates to be constants.

Multi-Armed Bandits

