1 code implementation • 18 Dec 2023 • Anne-Marie George, Christos Dimitrakakis
Furthermore, if all agents' preferences are strict rankings over the alternatives, we provide means to prune confidence intervals and thereby guide a more efficient elicitation.
no code implementations • 27 Nov 2023 • Thomas Kleine Buening, Aadirupa Saha, Christos Dimitrakakis, Haifeng Xu
We study a strategic variant of the multi-armed bandit problem, which we coin the strategic click-bandit.
1 code implementation • 21 Feb 2023 • Thomas Kleine Buening, Christos Dimitrakakis, Hannes Eriksson, Divya Grover, Emilio Jorge
While the Bayesian decision-theoretic framework offers an elegant solution to the problem of decision making under uncertainty, one question is how to appropriately select the prior distribution.
no code implementations • 18 Feb 2023 • Hannes Eriksson, Debabrota Basu, Tommy Tram, Mina Alibeigi, Christos Dimitrakakis
Then, we propose a generic two-stage algorithm, MLEMTRL, to address the MTRL problem in discrete and continuous settings.
no code implementations • 26 Oct 2022 • Thomas Kleine Buening, Christos Dimitrakakis
The task of learning a reward function from expert demonstrations suffers from high sample complexity as well as inherent limitations to what can be learned from demonstrations in a given environment.
no code implementations • 18 Mar 2022 • Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis
In existing literature, the risk in stochastic games has been studied in terms of the inherent uncertainty evoked by the variability of transitions and actions.
no code implementations • 8 Nov 2021 • Thomas Kleine Buening, Anne-Marie George, Christos Dimitrakakis
How should the first agent act in order to learn the joint reward function as quickly as possible and so that the joint policy is as close to optimal as possible?
no code implementations • 23 Apr 2021 • Hannes Eriksson, Christos Dimitrakakis, Lars Carlsson
We study the problem of performing automated experiment design for drug screening through Bayesian inference and optimisation.
1 code implementation • 15 Apr 2021 • Divya Grover, Christos Dimitrakakis
We instead propose an adaptive belief discretization scheme, and give its associated planning error.
no code implementations • 23 Feb 2021 • Thomas Kleine Buening, Meirav Segal, Debabrota Basu, Christos Dimitrakakis, Anne-Marie George
Typically, merit is defined with respect to some intrinsic measure of worth.
no code implementations • 22 Feb 2021 • Hannes Eriksson, Debabrota Basu, Mina Alibeigi, Christos Dimitrakakis
In this paper, we consider risk-sensitive sequential decision-making in Reinforcement Learning (RL).
no code implementations • NeurIPS Workshop ICBINB 2020 • Hannes Eriksson, Emilio Jorge, Christos Dimitrakakis, Debabrota Basu, Divya Grover
Bayesian reinforcement learning (BRL) offers a decision-theoretic solution for reinforcement learning.
no code implementations • 20 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Debabrota Basu
We derive the first polynomial-time Bayesian algorithm, BUCRL, that achieves, up to logarithmic factors, a regret (i.e., the difference between the accumulated rewards of the optimal policy and those of our algorithm) of the optimal order $\tilde{\mathcal{O}}(\sqrt{DSAT})$.
no code implementations • 14 Jun 2019 • Hannes Eriksson, Christos Dimitrakakis
The risk-averse behavior is then compared with the behavior of the optimal risk-neutral policy in environments with epistemic risk.
no code implementations • 4 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Jaroslaw Rzepecki, Katja Hofmann
We study two-player general sum repeated finite games where the rewards of each player are generated from an unknown distribution.
no code implementations • 29 May 2019 • Debabrota Basu, Christos Dimitrakakis, Aristide Tossou
We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions.
no code implementations • 27 May 2019 • Aristide Tossou, Debabrota Basu, Christos Dimitrakakis
We study model-based reinforcement learning in an unknown finite communicating Markov decision process.
no code implementations • 6 Apr 2019 • Nikolaos Tziortziotis, Christos Dimitrakakis, Michalis Vazirgiannis
We introduce Bayesian least-squares policy iteration (BLSPI), an off-policy, model-free, policy iteration algorithm that uses the Bayesian least-squares temporal-difference (BLSTD) learning algorithm to evaluate policies.
1 code implementation • 7 Feb 2019 • Divya Grover, Debabrota Basu, Christos Dimitrakakis
We address the problem of Bayesian reinforcement learning using efficient model-based online planning.
no code implementations • 24 Jun 2018 • Aristide C. Y. Tossou, Christos Dimitrakakis
This compares favorably to the previous result for Thompson Sampling in the literature (Mishra & Thakurta, 2015), which adds a term of $\mathcal{O}(\frac{K \ln^3 T}{\epsilon^2})$ to the regret in order to achieve the same privacy level.
no code implementations • NeurIPS 2017 • Christos Dimitrakakis, David C. Parkes, Goran Radanovic, Paul Tylkin
We consider a two-player sequential game in which agents have the same reward function but may disagree on the transition probabilities of an underlying Markovian model of the world.
no code implementations • 30 Jul 2017 • Philip Ekman, Sebastian Bellevik, Christos Dimitrakakis, Aristide Tossou
One specific such problem involves matching a set of workers to a set of tasks.
no code implementations • 6 Jul 2017 • Yang Liu, Goran Radanovic, Christos Dimitrakakis, Debmalya Mandal, David C. Parkes
In addition, we define the fairness regret, which corresponds to the degree to which an algorithm is not calibrated, where perfect calibration requires that the probability of selecting an arm is equal to the probability with which the arm has the best quality realization.
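As a rough illustration of the calibration idea in this snippet, the sketch below estimates how often each arm has the best quality realization and measures how far a selection distribution falls short of it. The function names and the specific gap measure (the sum of positive shortfalls in one round) are assumptions made for illustration, not necessarily the paper's exact definition.

```python
def best_arm_probabilities(samples):
    """Estimate, for each arm, the fraction of draws in which it is best.

    `samples` is a list of draws; each draw lists one quality realization
    per arm. Ties go to the lowest-indexed arm (an arbitrary choice).
    """
    n_arms = len(samples[0])
    wins = [0] * n_arms
    for draw in samples:
        best = max(range(n_arms), key=lambda i: draw[i])
        wins[best] += 1
    return [w / len(samples) for w in wins]


def calibration_gap(p_best, p_select):
    """Hypothetical per-round miscalibration measure.

    Sums, over arms, how much the selection probability falls short of
    the probability that the arm is best. It is zero when the two
    distributions coincide (perfect calibration).
    """
    return sum(max(b - s, 0.0) for b, s in zip(p_best, p_select))


# Four illustrative draws for two arms: arm 0 is best in three of them.
p_best = best_arm_probabilities([[1, 0], [0, 1], [2, 1], [3, 0]])
gap_calibrated = calibration_gap(p_best, p_best)   # selects arms exactly as often as they are best
gap_greedy = calibration_gap(p_best, [1.0, 0.0])   # always selects arm 0
```

Under this measure, a policy that always plays the arm most likely to be best is penalized by the probability mass it withholds from the other arms.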
no code implementations • 31 May 2017 • Christos Dimitrakakis, Yang Liu, David Parkes, Goran Radanovic
We consider the problem of how decision making can be fair when the underlying probabilistic model of the world is not known with certainty.
no code implementations • 16 Jan 2017 • Aristide C. Y. Tossou, Christos Dimitrakakis, Devdatt Dubhashi
We present a novel extension of Thompson Sampling for stochastic sequential decision problems with graph feedback, even when the graph structure itself is unknown and/or changing.
no code implementations • 16 Jan 2017 • Aristide C. Y. Tossou, Christos Dimitrakakis
This allows us to reach $\mathcal{O}{(\sqrt{\ln T})}$-DP, with a regret of $\mathcal{O}{(T^{2/3})}$ that holds against an adaptive adversary, an improvement from the best known of $\mathcal{O}{(T^{3/4})}$.
no code implementations • 22 Dec 2015 • Zuhe Zhang, Benjamin Rubinstein, Christos Dimitrakakis
We study how to communicate findings of Bayesian inference to third parties, while preserving the strong guarantee of differential privacy.
no code implementations • 27 Nov 2015 • Aristide Tossou, Christos Dimitrakakis
This is a significant improvement over previous results, which only achieve poly-log regret $O(\epsilon^{-2} \log^{2} T)$, because of our use of a novel interval-based mechanism.
no code implementations • 10 Dec 2014 • Emmanouil G. Androulakis, Christos Dimitrakakis
Bayesian methods suffer from the problem of how to specify prior beliefs.
no code implementations • 9 Aug 2014 • Aristide Tossou, Christos Dimitrakakis
To do so, we extend previous probabilistic approaches for inverse reinforcement learning in known MDPs to the case of unknown dynamics or opponents.
no code implementations • 14 Jul 2013 • Aristide C. Y. Tossou, Christos Dimitrakakis
To do so, we extend previous probabilistic approaches for inverse reinforcement learning in known MDPs to the case of unknown dynamics or opponents.
no code implementations • 5 Jun 2013 • Christos Dimitrakakis, Blaine Nelson, Zuhe Zhang, Aikaterini Mitrokotsa, Benjamin Rubinstein
All our general results hold for arbitrary database metrics, including those for the common definition of differential privacy.
no code implementations • 8 May 2013 • Nikolaos Tziortziotis, Christos Dimitrakakis, Konstantinos Blekas
This paper proposes an online tree-based Bayesian approach for reinforcement learning.
no code implementations • 27 Mar 2013 • Christos Dimitrakakis, Nikolaos Tziortziotis
This paper introduces a simple, general framework for likelihood-free Bayesian reinforcement learning, through Approximate Bayesian Computation (ABC).
no code implementations • 4 Mar 2013 • Florent Garcin, Christos Dimitrakakis, Boi Faltings
The profusion of online news articles makes it difficult to find interesting articles, a problem that can be assuaged by using a recommender system to bring the most relevant news stories to readers.