no code implementations • 20 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Debabrota Basu
We derive the first polynomial-time Bayesian algorithm, BUCRL, that achieves, up to logarithmic factors, a regret (i.e., the difference between the accumulated rewards of the optimal policy and our algorithm) of the optimal order $\tilde{\mathcal{O}}(\sqrt{DSAT})$.
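The scaling of the stated bound can be sketched numerically. Here $D$ is the MDP diameter, $S$ the number of states, $A$ the number of actions, and $T$ the horizon; the leading constant `c` and the parameter values below are illustrative assumptions, and logarithmic factors are dropped.

```python
import math

def bucrl_regret_bound(D, S, A, T, c=1.0):
    """Evaluate the sqrt(D*S*A*T) rate from the paper's bound.

    c is an unspecified leading constant (an assumption here);
    logarithmic factors hidden by the tilde are omitted.
    """
    return c * math.sqrt(D * S * A * T)

# The bound is sublinear in T, so per-step regret vanishes as T grows:
b_small = bucrl_regret_bound(D=4, S=10, A=2, T=10_000)
b_large = bucrl_regret_bound(D=4, S=10, A=2, T=1_000_000)
```

Because the rate is $\sqrt{T}$, the average regret per step, `b / T`, shrinks toward zero as the horizon increases.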
no code implementations • 4 Jun 2019 • Aristide Tossou, Christos Dimitrakakis, Jaroslaw Rzepecki, Katja Hofmann
We study two-player general sum repeated finite games where the rewards of each player are generated from an unknown distribution.
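The setting can be illustrated with a naive baseline: each player treats its own actions as bandit arms and runs epsilon-greedy on empirical mean rewards, ignoring the opponent's strategy. The 2x2 payoff matrices and all parameters below are illustrative assumptions, not the paper's game or algorithm.

```python
import random

def eps_greedy_repeated_game(T=5000, eps=0.1, seed=0):
    """Naive baseline for a repeated 2x2 game with stochastic rewards.

    Each player plays epsilon-greedy over its own two actions using
    empirical means; the mean-reward matrices are hypothetical.
    """
    rng = random.Random(seed)
    mean1 = [[0.7, 0.2], [0.4, 0.5]]  # row player's mean reward for (a1, a2)
    mean2 = [[0.3, 0.6], [0.8, 0.1]]  # column player's mean reward
    sums = [[0.0, 0.0], [0.0, 0.0]]   # running reward sums per player/action
    counts = [[1, 1], [1, 1]]         # start at 1 to avoid division by zero
    for _ in range(T):
        acts = []
        for p in range(2):
            if rng.random() < eps:
                acts.append(rng.randrange(2))
            else:
                m = [sums[p][a] / counts[p][a] for a in range(2)]
                acts.append(0 if m[0] >= m[1] else 1)
        a1, a2 = acts
        # Bernoulli rewards drawn from means unknown to the players
        r = [float(rng.random() < mean1[a1][a2]),
             float(rng.random() < mean2[a1][a2])]
        for p in range(2):
            sums[p][acts[p]] += r[p]
            counts[p][acts[p]] += 1
    return counts  # play counts per player and action
```

This baseline ignores the strategic coupling between the players, which is exactly the difficulty the general-sum repeated-game setting introduces.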
no code implementations • 29 May 2019 • Debabrota Basu, Christos Dimitrakakis, Aristide Tossou
We derive and contrast lower bounds on the regret of bandit algorithms satisfying these definitions.
no code implementations • 27 May 2019 • Aristide Tossou, Debabrota Basu, Christos Dimitrakakis
We study model-based reinforcement learning in an unknown finite communicating Markov decision process.
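The model-based recipe — estimate the transition model from experience, then plan on the learned model — can be sketched on a toy MDP. The 2-state, 2-action dynamics below are illustrative assumptions, and this sketch omits the optimism/posterior-sampling machinery that the paper's algorithm relies on.

```python
import random

def estimate_and_plan(T=20_000, gamma=0.9, seed=0):
    """Model-based sketch on a hypothetical 2-state, 2-action MDP.

    Collect transitions under a uniformly random policy, build an
    empirical model, then run value iteration on it. Rewards are
    observed noiselessly for simplicity.
    """
    rng = random.Random(seed)
    S, A = 2, 2
    P = [[0.9, 0.2], [0.3, 0.7]]  # true prob. of moving to state 1
    R = [[0.0, 0.1], [1.0, 0.5]]  # true mean rewards
    n1 = [[0] * A for _ in range(S)]    # counts of transitions into state 1
    nvis = [[0] * A for _ in range(S)]  # visit counts per (state, action)
    rsum = [[0.0] * A for _ in range(S)]
    s = 0
    for _ in range(T):
        a = rng.randrange(A)
        s2 = 1 if rng.random() < P[s][a] else 0
        n1[s][a] += s2
        rsum[s][a] += R[s][a]
        nvis[s][a] += 1
        s = s2
    # Empirical model from counts
    phat = [[n1[s][a] / max(nvis[s][a], 1) for a in range(A)] for s in range(S)]
    rhat = [[rsum[s][a] / max(nvis[s][a], 1) for a in range(A)] for s in range(S)]
    # Value iteration on the learned model
    V = [0.0, 0.0]
    for _ in range(500):
        V = [max(rhat[s][a] + gamma * (phat[s][a] * V[1]
                                       + (1 - phat[s][a]) * V[0])
                 for a in range(A)) for s in range(S)]
    return V, phat
```

In a communicating MDP every state is reachable from every other under some policy, which is what makes uniform exploration like this eventually cover the whole model.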
no code implementations • 30 Jul 2017 • Philip Ekman, Sebastian Bellevik, Christos Dimitrakakis, Aristide Tossou
One such problem involves matching a set of workers to a set of tasks.
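Worker-task matching is an instance of the assignment problem. A minimal exhaustive solver is sketched below; the cost matrix in the usage note is an illustrative assumption, and a real system would use the Hungarian algorithm ($O(n^3)$) rather than $O(n!)$ enumeration.

```python
from itertools import permutations

def min_cost_matching(cost):
    """Exhaustive minimum-cost matching of n workers to n tasks.

    cost[i][j] is worker i's cost on task j. Only suitable for small n;
    shown here purely to make the matching objective concrete.
    """
    n = len(cost)
    best_perm, best_cost = None, float("inf")
    for perm in permutations(range(n)):
        c = sum(cost[i][perm[i]] for i in range(n))
        if c < best_cost:
            best_cost, best_perm = c, perm
    return best_perm, best_cost
```

For example, `min_cost_matching([[4, 1, 3], [2, 0, 5], [3, 2, 2]])` returns `((1, 0, 2), 5)`: worker 0 takes task 1, worker 1 takes task 0, worker 2 takes task 2, at total cost 5.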
no code implementations • 27 Nov 2015 • Aristide Tossou, Christos Dimitrakakis
This is a significant improvement over previous results, which only achieve poly-log regret $O(\epsilon^{-2} \log^{2} T)$, because of our use of a novel interval-based mechanism.
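The interval idea can be sketched schematically: instead of adding privacy noise to a released statistic at every step, release it only at doubling interval boundaries, so Laplace noise is injected $O(\log T)$ times rather than $T$ times. This is an illustrative sketch of that general principle, not the paper's mechanism or its privacy accounting.

```python
import math
import random

def doubling_intervals(T):
    """Boundaries 1, 2, 4, ... <= T at which a statistic is released."""
    points, p = [], 1
    while p <= T:
        points.append(p)
        p *= 2
    return points

def laplace(scale, rng):
    """Sample Laplace(0, scale) noise via the inverse CDF."""
    u = rng.random() - 0.5
    return -scale * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))

def private_running_mean(rewards, eps=1.0, seed=0):
    """Release a Laplace-noised running mean only at interval boundaries.

    The noise scale 1/eps and the per-release accounting are simplified
    assumptions; a real DP analysis must track sensitivity and composition.
    """
    rng = random.Random(seed)
    releases, total = [], 0.0
    boundaries = set(doubling_intervals(len(rewards)))
    for t, r in enumerate(rewards, start=1):
        total += r
        if t in boundaries:
            releases.append((total + laplace(1.0 / eps, rng)) / t)
    return releases
```

Fewer releases means less total noise has to be absorbed, which is the intuition behind interval-based mechanisms beating per-step noising.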
no code implementations • 9 Aug 2014 • Aristide Tossou, Christos Dimitrakakis
To do so, we extend previous probabilistic approaches for inverse reinforcement learning in known MDPs to the case of unknown dynamics or opponents.
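A common probabilistic model in this line of work scores demonstrations with a Boltzmann (softmax) policy, $P(a \mid s) \propto \exp(\beta\, Q(s, a))$. The sketch below computes that likelihood for candidate Q-values in a known model; the Q-values, demonstrations, and temperature are illustrative assumptions, and the paper's extension to unknown dynamics or opponents is not covered here.

```python
import math

def boltzmann_policy(q_values, beta=2.0):
    """Softmax action distribution P(a|s) proportional to exp(beta * Q(s, a))."""
    mx = max(q_values)  # subtract the max for numerical stability
    exps = [math.exp(beta * (q - mx)) for q in q_values]
    z = sum(exps)
    return [e / z for e in exps]

def demo_log_likelihood(demos, Q, beta=2.0):
    """Log-likelihood of demonstrated (state, action) pairs.

    Q[s][a] would come from solving the MDP under a candidate reward;
    here it is supplied directly as a hypothetical input.
    """
    ll = 0.0
    for s, a in demos:
        ll += math.log(boltzmann_policy(Q[s], beta)[a])
    return ll
```

Inverse RL then searches for reward parameters whose induced Q-values make the observed demonstrations most likely.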