no code implementations • 5 Mar 2024 • Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, Mikael Skoglund
This paper studies the Bayesian regret of a variant of the Thompson Sampling algorithm for bandit problems.
no code implementations • 26 Apr 2023 • Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, Mikael Skoglund
In this work, we study the performance of the Thompson Sampling algorithm for contextual bandit problems based on the framework introduced by Neu et al. and their concept of the lifted information ratio.
no code implementations • 18 Jul 2022 • Amaury Gouverneur, Borja Rodríguez-Gálvez, Tobias J. Oechtering, Mikael Skoglund
Building on the framework introduced by Xu and Raginsky [1] for supervised learning problems, we study the best achievable performance for model-based Bayesian reinforcement learning problems.
1 code implementation • 13 Apr 2022 • Antoine Aspeel, Amaury Gouverneur, Raphaël M. Jungers, Benoit Macq
We prove that in terms of expected mean square error, the stochastic program filter outperforms the online filter, which itself outperforms the offline filter.