no code implementations • 30 Sep 2023 • Alexander Galozy, Sadi Alawadi, Victor Kebande, Sławomir Nowaczyk
This paper investigates the issue of privacy in a learning scenario where users share knowledge for a recommendation task.
no code implementations • 8 Jul 2022 • Alexander Galozy, Slawomir Nowaczyk
In the latent bandit problem, the learner has access to reward distributions and, for the non-stationary variant, transition models of the environment.
1 code implementation • 16 Nov 2020 • Alexander Galozy, Slawomir Nowaczyk, Mattias Ohlsson
We present an algorithm that uses a referee to dynamically combine the policies of a contextual bandit and a multi-armed bandit.
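The listing does not say how the referee works, so the following is only a rough illustration of the general idea, not the authors' algorithm. It assumes the referee is itself a small epsilon-greedy learner that picks, at each step, whether the multi-armed bandit or the contextual bandit chooses the arm. The class `EpsilonGreedy`, the function `run`, the toy reward probabilities, and all parameter values are hypothetical.

```python
import random


class EpsilonGreedy:
    """Tiny epsilon-greedy learner over n options (hypothetical helper)."""

    def __init__(self, n, epsilon, rng):
        self.counts = [0] * n
        self.values = [0.0] * n  # running mean reward per option
        self.epsilon = epsilon
        self.rng = rng

    def select(self):
        # Explore with probability epsilon, otherwise pick the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(len(self.values))
        return max(range(len(self.values)), key=self.values.__getitem__)

    def update(self, option, reward):
        self.counts[option] += 1
        self.values[option] += (reward - self.values[option]) / self.counts[option]


def run(steps=2000, seed=0):
    rng = random.Random(seed)
    n_arms, n_contexts = 2, 2
    # Assumed toy environment: the better arm depends on the context.
    probs = [[0.9, 0.1], [0.1, 0.9]]

    mab = EpsilonGreedy(n_arms, 0.1, rng)  # ignores context
    ctx_bandit = [EpsilonGreedy(n_arms, 0.1, rng) for _ in range(n_contexts)]
    referee = EpsilonGreedy(2, 0.1, rng)  # option 0 = MAB, 1 = contextual

    total = 0.0
    for _ in range(steps):
        context = rng.randrange(n_contexts)
        choice = referee.select()
        arm = mab.select() if choice == 0 else ctx_bandit[context].select()
        reward = 1.0 if rng.random() < probs[context][arm] else 0.0

        # Both policies learn from every observation; the referee is
        # credited only for the policy it delegated to.
        mab.update(arm, reward)
        ctx_bandit[context].update(arm, reward)
        referee.update(choice, reward)
        total += reward
    return referee, total / steps


if __name__ == "__main__":
    referee, avg_reward = run()
    print("referee value estimates (MAB, contextual):", referee.values)
    print("average reward:", avg_reward)
```

In this toy setup the referee's two value estimates indicate which policy it has come to prefer. A real referee could use other signals, such as each policy's confidence, to make that choice.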