no code implementations • 27 Nov 2023 • Thomas Kleine Buening, Aadirupa Saha, Christos Dimitrakakis, Haifeng Xu
We study a strategic variant of the multi-armed bandit problem, which we call the strategic click-bandit.
1 code implementation • 21 Feb 2023 • Thomas Kleine Buening, Christos Dimitrakakis, Hannes Eriksson, Divya Grover, Emilio Jorge
While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution.
no code implementations • 26 Oct 2022 • Thomas Kleine Buening, Christos Dimitrakakis
The task of learning a reward function from expert demonstrations suffers from high sample complexity as well as inherent limitations to what can be learned from demonstrations in a given environment.
no code implementations • 25 Oct 2022 • Thomas Kleine Buening, Aadirupa Saha
We study the problem of non-stationary dueling bandits and provide the first adaptive dynamic regret algorithm for this problem.
no code implementations • 8 Nov 2021 • Thomas Kleine Buening, Anne-Marie George, Christos Dimitrakakis
How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible?
no code implementations • 23 Feb 2021 • Thomas Kleine Buening, Meirav Segal, Debabrota Basu, Christos Dimitrakakis, Anne-Marie George
Typically, merit is defined with respect to some intrinsic measure of worth.