Search Results for author: Thodoris Lykouris

Found 12 papers, 1 paper with code

Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations NeurIPS 2021 Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well-specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making, Meta-Learning, +1
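The $\epsilon$ in the bound above is the total-variation distance between the two priors. A minimal pure-Python sketch of that quantity; the priors and horizon below are made-up illustrative values, not from the paper:

```python
def total_variation(p, q):
    """Total-variation distance between two discrete distributions,
    given as probability vectors over the same support."""
    return 0.5 * sum(abs(pi - qi) for pi, qi in zip(p, q))

# Illustrative (made-up) priors over three candidate bandit instances.
true_prior = [0.5, 0.3, 0.2]
misspecified_prior = [0.4, 0.4, 0.2]

eps = total_variation(true_prior, misspecified_prior)  # the epsilon in the bound
H = 10                                                 # learning horizon
reward_gap_bound = H ** 2 * eps  # paper's O-tilde(H^2 * eps), ignoring log factors
```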

Bandits with adversarial scaling

no code implementations ICML 2020 Thodoris Lykouris, Vahab Mirrokni, Renato Paes Leme

We study "adversarial scaling", a multi-armed bandit model where rewards have a stochastic and an adversarial component.
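A hypothetical simulator for this reward model, assuming the adversarial component acts as a per-round scale applied to a Bernoulli stochastic draw; the function and parameters are illustrative, not the paper's formal definition:

```python
import random

def scaled_reward(mu, scale, rng):
    """One round of the adversarial-scaling model (hypothetical sketch):
    a stochastic Bernoulli(mu) draw multiplied by an adversarially chosen
    scale in [0, 1] for that round."""
    return scale * (1.0 if rng.random() < mu else 0.0)

rng = random.Random(1)
# Adversary picks scale 0.25 this round; arm has mean reward 0.5.
rewards = [scaled_reward(0.5, 0.25, rng) for _ in range(8)]
```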

Contextual Search in the Presence of Irrational Agents

no code implementations 26 Feb 2020 Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert Schapire

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying behavioral model.

Learning Theory
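With fully rational agents, contextual search in one dimension reduces to a bisection: query a point, observe whether the hidden value lies above or below, and halve the interval. The sketch below shows that rational baseline (the paper's contribution is handling agents whose responses can be inconsistent with this model); names and parameters are illustrative:

```python
def bisection_search(oracle, lo=0.0, hi=1.0, rounds=20):
    """Bisection baseline for search with directional feedback: the oracle
    answers whether the hidden value is above the queried point. Irrational
    agents would occasionally answer inconsistently with any fixed value."""
    for _ in range(rounds):
        mid = (lo + hi) / 2.0
        if oracle(mid):      # True: hidden value is above the query
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

hidden_value = 0.3
estimate = bisection_search(lambda q: hidden_value > q)
```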

Corruption-robust exploration in episodic reinforcement learning

no code implementations 20 Nov 2019 Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We initiate the study of multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system, extending recent results for the special case of stochastic bandits.

Multi-Armed Bandits

Advancing subgroup fairness via sleeping experts

no code implementations 18 Sep 2019 Avrim Blum, Thodoris Lykouris

We demonstrate that satisfying this guarantee for multiple overlapping groups is not straightforward: even for the simple objective of the unweighted average of false-negative and false-positive rates, satisfying it for overlapping populations can be statistically impossible, even when we are provided predictors that perform well on each subgroup separately.

Fairness
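The objective mentioned above, the unweighted average of false-negative and false-positive rates, can be evaluated per overlapping subgroup as follows; the labels, predictions, and group definitions are made up for illustration:

```python
def balanced_error(y_true, y_pred):
    """Unweighted average of the false-negative rate and false-positive rate."""
    pos = [p for t, p in zip(y_true, y_pred) if t == 1]
    neg = [p for t, p in zip(y_true, y_pred) if t == 0]
    fnr = sum(1 for p in pos if p == 0) / len(pos) if pos else 0.0
    fpr = sum(1 for p in neg if p == 1) / len(neg) if neg else 0.0
    return 0.5 * (fnr + fpr)

# Illustrative data; groups "A" and "B" overlap on indices 2 and 3.
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
groups = {"A": [0, 1, 2, 3], "B": [2, 3, 4, 5]}
per_group = {
    g: balanced_error([y_true[i] for i in idx], [y_pred[i] for i in idx])
    for g, idx in groups.items()
}
```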

Feedback graph regret bounds for Thompson Sampling and UCB

no code implementations 23 May 2019 Thodoris Lykouris, Eva Tardos, Drishti Wali

We study the stochastic multi-armed bandit problem with the graph-based feedback structure introduced by Mannor and Shamir.
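In the Mannor–Shamir feedback model, playing an arm also reveals the rewards of its neighbors in a feedback graph. Below is a hedged sketch of a UCB-style algorithm that updates every observed arm, an illustrative simplification rather than the paper's exact algorithm or its analysis:

```python
import math
import random

def ucb_with_graph_feedback(means, neighbors, horizon, rng=random.Random(0)):
    """UCB sketch under graph feedback (assumption: playing arm i also
    reveals a Bernoulli reward sample for every arm in neighbors[i])."""
    n = len(means)
    counts, sums = [0] * n, [0.0] * n
    total_reward = 0.0
    for t in range(1, horizon + 1):
        def index(i):
            if counts[i] == 0:
                return float("inf")  # force at least one observation
            return sums[i] / counts[i] + math.sqrt(2 * math.log(t) / counts[i])
        arm = max(range(n), key=index)
        # Observe the played arm and all of its graph neighbors.
        for j in {arm} | set(neighbors[arm]):
            r = 1.0 if rng.random() < means[j] else 0.0
            counts[j] += 1
            sums[j] += r
            if j == arm:
                total_reward += r
    return total_reward

# Three arms on a path graph: 0 - 1 - 2 (illustrative instance).
reward = ucb_with_graph_feedback([0.9, 0.1, 0.5], [[1], [0, 2], [1]], horizon=200)
```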

On preserving non-discrimination when combining expert advice

no code implementations NeurIPS 2018 Avrim Blum, Suriya Gunasekar, Thodoris Lykouris, Nathan Srebro

We study the interplay between sequential decision making and avoiding discrimination against protected groups, when examples arrive online and do not follow distributional assumptions.

Decision Making

Stochastic bandits robust to adversarial corruptions

no code implementations 25 Mar 2018 Thodoris Lykouris, Vahab Mirrokni, Renato Paes Leme

We introduce a new model of stochastic bandits with adversarial corruptions, which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially changed to trick the algorithm, e.g., click fraud, fake reviews, and email spam.
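A hypothetical generator for this input model: mostly stochastic rewards with a bounded amount of adversarial tampering. The flip-early-ones adversary below is just one illustrative strategy, not something specified in the paper:

```python
import random

def corrupted_rewards(mu, horizon, budget, rng):
    """Sketch of the corrupted stochastic-bandit input model: rewards are
    i.i.d. Bernoulli(mu), but an adversary may alter them, spending from a
    total corruption budget. Here it simply flips the earliest 1s to 0s."""
    out, spent = [], 0.0
    for _ in range(horizon):
        r = 1.0 if rng.random() < mu else 0.0
        if r == 1.0 and spent + 1.0 <= budget:
            r, spent = 0.0, spent + 1.0  # adversarial flip, costs 1 unit
        out.append(r)
    return out, spent

rewards, spent = corrupted_rewards(mu=0.9, horizon=50, budget=5.0,
                                   rng=random.Random(0))
```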

Competitive caching with machine learned advice

no code implementations ICML 2018 Thodoris Lykouris, Sergei Vassilvitskii

Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution as compared to an offline optimum.

Decision Making, Decision Making Under Uncertainty
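One natural way to use learned advice in caching, in the spirit of this line of work: evict the cached page whose predicted next request is furthest in the future, i.e., Belady's offline rule driven by (possibly erroneous) predictions. The helper below is an illustrative sketch, not the paper's exact algorithm:

```python
def evict_with_predictions(cache, predicted_next_use):
    """Pick an eviction victim using learned predictions (sketch): the page
    whose predicted next request time is largest. A page with no prediction
    is treated as never requested again."""
    return max(cache, key=lambda page: predicted_next_use.get(page, float("inf")))

cache = {"a", "b", "c"}
predicted_next_use = {"a": 4, "b": 9, "c": 6}  # predicted next request times
victim = evict_with_predictions(cache, predicted_next_use)
```

If the predictions were perfect, this would match the offline-optimal eviction rule; the paper's point is to stay competitive when they are not.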

Small-loss bounds for online learning with partial information

no code implementations 9 Nov 2017 Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We develop a black-box approach for such problems where the learner observes as feedback only losses of a subset of the actions that includes the selected action.

Multi-Armed Bandits

Learning in Games: Robustness of Fast Convergence

no code implementations NeurIPS 2016 Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We show that learning algorithms satisfying a $\textit{low approximate regret}$ property experience fast convergence to approximate optimality in a large class of repeated games.
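Hedge (multiplicative weights) with a suitable step size is a canonical example of an algorithm satisfying a low-approximate-regret property; one update step looks like the following sketch, with illustrative weights, losses, and step size:

```python
import math

def hedge_update(weights, losses, eta):
    """One step of the Hedge / multiplicative-weights update: exponentially
    down-weight each action by its loss, then renormalize."""
    new = [w * math.exp(-eta * l) for w, l in zip(weights, losses)]
    z = sum(new)
    return [w / z for w in new]

# Two actions; action 0 incurred loss 1, action 1 incurred loss 0.
probs = hedge_update([0.5, 0.5], [1.0, 0.0], eta=0.5)
```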
