Search Results for author: Danil Provodin

Found 5 papers, 3 papers with code

Provably Efficient Exploration in Constrained Reinforcement Learning: Posterior Sampling Is All You Need

no code implementations • 27 Sep 2023 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

We present a new algorithm based on posterior sampling for learning in constrained Markov decision processes (CMDP) in the infinite-horizon undiscounted setting.

Efficient Exploration

Learning Optimal Bidding Strategy: Case Study in E-Commerce Advertising

no code implementations • 31 Mar 2023 • Danil Provodin, Jérémie Joudioux, Eduard Duryev

Although the bandits framework is a classical and well-suited approach for optimal bidding strategies in sponsored search auctions, industrial attempts are rarely documented.

The Impact of Batch Learning in Stochastic Linear Bandits

1 code implementation • 14 Feb 2022 • Danil Provodin, Pratik Gajane, Mykola Pechenizkiy, Maurits Kaptein

Our main theoretical results show that the impact of batch learning is a multiplicative factor of batch size relative to the regret of online behavior.
