Search Results for author: Denis Denisov

Found 1 paper, 0 papers with code

Regret Analysis of a Markov Policy Gradient Algorithm for Multi-arm Bandits

no code implementations • 20 Jul 2020 • Denis Denisov, Neil Walton

We consider a policy gradient algorithm applied to a finite-arm bandit problem with Bernoulli rewards.
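To make the setting concrete, the following is a minimal sketch of a generic softmax policy gradient (REINFORCE-style) update on a finite-arm bandit with Bernoulli rewards. It is not the algorithm analysed in the paper, whose details are not given in this listing. The function names (`softmax`, `run_policy_gradient`), the arm means, the step size, and the way regret is measured are all assumptions chosen for illustration.

```python
import math
import random


def softmax(prefs):
    """Turn a list of preferences into a probability distribution."""
    m = max(prefs)  # subtract the max for numerical stability
    exps = [math.exp(p - m) for p in prefs]
    total = sum(exps)
    return [e / total for e in exps]


def run_policy_gradient(means, steps=5000, step_size=0.1, seed=0):
    """Run a softmax policy gradient on a Bernoulli bandit (illustrative only).

    means     -- hypothetical success probability of each arm
    steps     -- number of arm pulls
    step_size -- hypothetical constant learning rate
    Returns the final policy and the empirical regret.
    """
    rng = random.Random(seed)
    prefs = [0.0] * len(means)  # one preference per arm, uniform start
    total_reward = 0
    for _ in range(steps):
        probs = softmax(prefs)
        # Sample an arm from the current policy.
        arm = rng.choices(range(len(means)), weights=probs)[0]
        # Bernoulli reward: 1 with probability means[arm], else 0.
        reward = 1 if rng.random() < means[arm] else 0
        total_reward += reward
        # Stochastic gradient step: reward * grad of log pi(arm).
        for a in range(len(means)):
            indicator = 1.0 if a == arm else 0.0
            prefs[a] += step_size * reward * (indicator - probs[a])
    # Empirical regret: expected reward of always playing the best arm
    # minus the reward actually collected.
    regret = steps * max(means) - total_reward
    return softmax(prefs), regret


if __name__ == "__main__":
    policy, regret = run_policy_gradient([0.2, 0.5, 0.8])
    print("final policy:", policy)
    print("empirical regret:", regret)
```

The sketch uses no baseline and a constant step size to keep the update short; both are simplifications, and a different choice may change how quickly the policy concentrates on the best arm.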
