Search Results for author: Thodoris Lykouris

Found 15 papers, 1 paper with code

Learning in Games: Robustness of Fast Convergence

no code implementations NeurIPS 2016 Dylan J. Foster, Zhiyuan Li, Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We show that learning algorithms satisfying a $\textit{low approximate regret}$ property experience fast convergence to approximate optimality in a large class of repeated games.

Small-loss bounds for online learning with partial information

no code implementations 9 Nov 2017 Thodoris Lykouris, Karthik Sridharan, Eva Tardos

We develop a black-box approach for such problems where the learner observes as feedback only losses of a subset of the actions that includes the selected action.

Multi-Armed Bandits

Competitive caching with machine learned advice

no code implementations ICML 2018 Thodoris Lykouris, Sergei Vassilvitskii

Traditional online algorithms encapsulate decision making under uncertainty, and give ways to hedge against all possible future events, while guaranteeing a nearly optimal solution as compared to an offline optimum.

Decision Making, Decision Making Under Uncertainty

Stochastic bandits robust to adversarial corruptions

no code implementations 25 Mar 2018 Thodoris Lykouris, Vahab Mirrokni, Renato Paes Leme

We introduce a new model of stochastic bandits with adversarial corruptions which aims to capture settings where most of the input follows a stochastic pattern but some fraction of it can be adversarially changed to trick the algorithm, e.g., click fraud, fake reviews and email spam.

On preserving non-discrimination when combining expert advice

no code implementations NeurIPS 2018 Avrim Blum, Suriya Gunasekar, Thodoris Lykouris, Nathan Srebro

We study the interplay between sequential decision making and avoiding discrimination against protected groups, when examples arrive online and do not follow distributional assumptions.

Decision Making

Feedback graph regret bounds for Thompson Sampling and UCB

no code implementations 23 May 2019 Thodoris Lykouris, Eva Tardos, Drishti Wali

We study the stochastic multi-armed bandit problem with the graph-based feedback structure introduced by Mannor and Shamir.

Thompson Sampling

Advancing subgroup fairness via sleeping experts

no code implementations 18 Sep 2019 Avrim Blum, Thodoris Lykouris

We demonstrate that the task of satisfying this guarantee for multiple overlapping groups is not straightforward and show that, for the simple objective of the unweighted average of the false negative and false positive rates, satisfying this for overlapping populations can be statistically impossible even when we are provided predictors that perform well separately on each subgroup.

Fairness

Corruption-robust exploration in episodic reinforcement learning

no code implementations 20 Nov 2019 Thodoris Lykouris, Max Simchowitz, Aleksandrs Slivkins, Wen Sun

We initiate the study of multi-stage episodic reinforcement learning under adversarial corruptions in both the rewards and the transition probabilities of the underlying system, extending recent results for the special case of stochastic bandits.

Multi-Armed Bandits, reinforcement-learning

Contextual Search in the Presence of Adversarial Corruptions

no code implementations 26 Feb 2020 Akshay Krishnamurthy, Thodoris Lykouris, Chara Podimata, Robert Schapire

We initiate the study of contextual search when some of the agents can behave in ways inconsistent with the underlying response model.

Learning Theory

Bandits with adversarial scaling

no code implementations ICML 2020 Thodoris Lykouris, Vahab Mirrokni, Renato Paes Leme

We study "adversarial scaling", a multi-armed bandit model where rewards have a stochastic and an adversarial component.

Bayesian decision-making under misspecified priors with applications to meta-learning

no code implementations NeurIPS 2021 Max Simchowitz, Christopher Tosh, Akshay Krishnamurthy, Daniel Hsu, Thodoris Lykouris, Miroslav Dudík, Robert E. Schapire

We prove that the expected reward accrued by Thompson sampling (TS) with a misspecified prior differs by at most $\tilde{\mathcal{O}}(H^2 \epsilon)$ from TS with a well specified prior, where $\epsilon$ is the total-variation distance between priors and $H$ is the learning horizon.

Decision Making, Meta-Learning

Efficient decentralized multi-agent learning in asymmetric bipartite queueing systems

no code implementations 5 Jun 2022 Daniel Freund, Thodoris Lykouris, Wentao Weng

We study decentralized multi-agent learning in bipartite queueing systems, a standard model for service systems.

Learning in Stackelberg Games with Non-myopic Agents

no code implementations 19 Aug 2022 Nika Haghtalab, Thodoris Lykouris, Sloan Nietert, Alex Wei

Although learning in Stackelberg games is well-understood when the agent is myopic, non-myopic agents pose additional complications.

Learning to Defer in Content Moderation: The Human-AI Interplay

no code implementations 19 Feb 2024 Thodoris Lykouris, Wentao Weng

The classical learning-theoretic way to capture this human-AI interplay is via the framework of learning to defer, where the algorithm has the option to defer a classification task to humans for a fixed cost and immediately receive feedback.

Scheduling
