Search Results for author: Sanjay Shakkottai

Found 50 papers, 7 papers with code

Collaborative Multi-Agent Heterogeneous Multi-Armed Bandits

no code implementations 30 May 2023 Ronshee Chawla, Daniel Vial, Sanjay Shakkottai, R. Srikant

The study of collaborative multi-agent bandits has attracted significant attention recently.

Multi-Armed Bandits

InfoNCE Loss Provably Learns Cluster-Preserving Representations

no code implementations 15 Feb 2023 Advait Parulekar, Liam Collins, Karthikeyan Shanmugam, Aryan Mokhtari, Sanjay Shakkottai

The goal of contrastive learning is to learn a representation that preserves underlying clusters by keeping samples with similar content, e.g. the ``dogness'' of a dog, close to each other in the space generated by the representation.

Beyond Uniform Smoothness: A Stopped Analysis of Adaptive SGD

no code implementations 13 Feb 2023 Matthew Faw, Litu Rout, Constantine Caramanis, Sanjay Shakkottai

Despite the richness, an emerging line of works achieves the $\widetilde{\mathcal{O}}(\frac{1}{\sqrt{T}})$ rate of convergence when the noise of the stochastic gradients is deterministically and uniformly bounded.

PAC Generalization via Invariant Representations

no code implementations 30 May 2022 Advait Parulekar, Karthikeyan Shanmugam, Sanjay Shakkottai

These are representations of the covariates such that the best model on top of the representation is invariant across training environments.

Out-of-Distribution Generalization, PAC learning

Non-Stationary Bandits under Recharging Payoffs: Improved Planning with Sublinear Regret

no code implementations 29 May 2022 Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

Even assuming prior knowledge of the mean payoff functions, computing an optimal planning in the above model is NP-hard, while the state-of-the-art is a $1/4$-approximation algorithm for the case where at most one arm can be played per round.


FedAvg with Fine Tuning: Local Updates Lead to Representation Learning

no code implementations 27 May 2022 Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai

We show that the reason behind the generalizability of FedAvg's output is its power in learning the common data representation among the clients' tasks, by leveraging the diversity among client data distributions via local updates.

Federated Learning, Image Classification +1

Minimax Regret for Cascading Bandits

no code implementations 23 Mar 2022 Daniel Vial, Sujay Sanghavi, Sanjay Shakkottai, R. Srikant

Cascading bandits is a natural and popular model that frames the task of learning to rank from Bernoulli click feedback in a bandit setting.


Robust Multi-Agent Bandits Over Undirected Graphs

no code implementations 28 Feb 2022 Daniel Vial, Sanjay Shakkottai, R. Srikant

Thus, we generalize existing regret bounds beyond the complete graph (where $d_{\text{mal}}(i) = m$), and show the effect of malicious agents is entirely local (in the sense that only the $d_{\text{mal}}(i)$ malicious agents directly connected to $i$ affect its long-term regret).

The Power of Adaptivity in SGD: Self-Tuning Step Sizes with Unbounded Gradients and Affine Variance

no code implementations 11 Feb 2022 Matthew Faw, Isidoros Tziotis, Constantine Caramanis, Aryan Mokhtari, Sanjay Shakkottai, Rachel Ward

We study convergence rates of AdaGrad-Norm as an exemplar of adaptive stochastic gradient methods (SGD), where the step sizes change based on observed stochastic gradients, for minimizing non-convex, smooth objectives.

MAML and ANIL Provably Learn Representations

no code implementations 7 Feb 2022 Liam Collins, Aryan Mokhtari, Sewoong Oh, Sanjay Shakkottai

Recent empirical evidence has driven conventional wisdom to believe that gradient-based meta-learning (GBML) methods perform well at few-shot learning because they learn an expressive data representation that is shared across tasks.

Few-Shot Learning, Representation Learning

Improved Algorithms for Misspecified Linear Markov Decision Processes

no code implementations 12 Sep 2021 Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

(P1) Its regret after $K$ episodes scales as $K \max \{ \varepsilon_{\text{mis}}, \varepsilon_{\text{tol}} \}$, where $\varepsilon_{\text{mis}}$ is the degree of misspecification and $\varepsilon_{\text{tol}}$ is a user-specified error tolerance.

Multi-Armed Bandits

Episodic Bandits with Stochastic Experts

no code implementations 7 Jul 2021 Nihal Sharma, Soumya Basu, Karthikeyan Shanmugam, Sanjay Shakkottai

The agent interacts with the environment over episodes, with each episode having different context distributions; this results in the `best expert' changing across episodes.

Finite-Sample Analysis of Off-Policy TD-Learning via Generalized Bellman Operators

no code implementations NeurIPS 2021 Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

Our key step is to show that the generalized Bellman operator is simultaneously a contraction mapping with respect to a weighted $\ell_p$-norm for each $p$ in $[1,\infty)$, with a common contraction factor.

Does Optimal Source Task Performance Imply Optimal Pre-training for a Target Task?

1 code implementation 21 Jun 2021 Steven Gutstein, Brent Lance, Sanjay Shakkottai

Fine-tuning of pre-trained deep nets is commonly used to improve accuracies and training times for neural nets.

Job Dispatching Policies for Queueing Systems with Unknown Service Rates

no code implementations 8 Jun 2021 Tuhinangshu Choudhury, Gauri Joshi, Weina Wang, Sanjay Shakkottai

In multi-server queueing systems where there is no central queue holding all incoming jobs, job dispatching policies are used to assign incoming jobs to the queue at one of the servers.

Combinatorial Blocking Bandits with Stochastic Delays

no code implementations 22 May 2021 Alexia Atsidakou, Orestis Papadigenopoulos, Soumya Basu, Constantine Caramanis, Sanjay Shakkottai

Recent work has considered natural variations of the multi-armed bandit problem, where the reward distribution of each arm is a special function of the time passed since its last pulling.


Regret Bounds for Stochastic Shortest Path Problems with Linear Function Approximation

no code implementations 4 May 2021 Daniel Vial, Advait Parulekar, Sanjay Shakkottai, R. Srikant

We propose an algorithm that uses linear function approximation (LFA) for stochastic shortest path (SSP).

Linear Bandit Algorithms with Sublinear Time Complexity

no code implementations 3 Mar 2021 Shuo Yang, Tongzheng Ren, Sanjay Shakkottai, Eric Price, Inderjit S. Dhillon, Sujay Sanghavi

For sufficiently large $K$, our algorithms have sublinear per-step complexity and $\tilde O(\sqrt{T})$ regret.

Movie Recommendation

Exploiting Shared Representations for Personalized Federated Learning

4 code implementations 14 Feb 2021 Liam Collins, Hamed Hassani, Aryan Mokhtari, Sanjay Shakkottai

Based on this intuition, we propose a novel federated learning framework and algorithm for learning a shared data representation across clients and unique local heads for each client.

Meta-Learning, Multi-Task Learning +2

A Lyapunov Theory for Finite-Sample Guarantees of Asynchronous Q-Learning and TD-Learning Variants

no code implementations 2 Feb 2021 Zaiwei Chen, Siva Theja Maguluri, Sanjay Shakkottai, Karthikeyan Shanmugam

As a by-product, by analyzing the convergence bounds of $n$-step TD and TD$(\lambda)$, we provide theoretical insights into the bias-variance trade-off, i.e., efficiency of bootstrapping in RL.

Q-Learning, Reinforcement Learning (RL)

One-bit feedback is sufficient for upper confidence bound policies

no code implementations 4 Dec 2020 Daniel Vial, Sanjay Shakkottai, R. Srikant

We consider a variant of the traditional multi-armed bandit problem in which each arm is only able to provide one-bit feedback during each pull based on its past history of rewards.

Stochastic Linear Bandits with Protected Subspace

no code implementations 2 Nov 2020 Advait Parulekar, Soumya Basu, Aditya Gopalan, Karthikeyan Shanmugam, Sanjay Shakkottai

We study a variant of the stochastic linear bandit problem wherein we optimize a linear objective function but rewards are accrued only orthogonal to an unknown subspace (which we interpret as a \textit{protected space}) given only zero-order stochastic oracle access to both the objective itself and protected subspace.

How Does the Task Landscape Affect MAML Performance?

no code implementations 27 Oct 2020 Liam Collins, Aryan Mokhtari, Sanjay Shakkottai

Model-Agnostic Meta-Learning (MAML) has become increasingly popular for training models that can quickly adapt to new tasks via one or a few stochastic gradient descent steps.

Few-Shot Image Classification, Meta-Learning

Adaptive KL-UCB based Bandit Algorithms for Markovian and i.i.d. Settings

no code implementations 14 Sep 2020 Arghyadip Roy, Sanjay Shakkottai, R. Srikant

I.i.d. rewards are a special case of Markov rewards and it is difficult to design an algorithm that works well independent of whether the underlying model is truly Markovian or i.i.d.

Robust Multi-Agent Multi-Armed Bandits

no code implementations 7 Jul 2020 Daniel Vial, Sanjay Shakkottai, R. Srikant

Recent works have shown that agents facing independent instances of a stochastic $K$-armed bandit can collaborate to decrease regret.

Distributed Computing, Multi-Armed Bandits +1

Multi-Agent Low-Dimensional Linear Bandits

no code implementations 2 Jul 2020 Ronshee Chawla, Abishek Sankararaman, Sanjay Shakkottai

We study a multi-agent stochastic linear bandit with side information, parameterized by an unknown vector $\theta^* \in \mathbb{R}^d$.

Contextual Blocking Bandits

no code implementations 6 Mar 2020 Soumya Basu, Orestis Papadigenopoulos, Constantine Caramanis, Sanjay Shakkottai

Assuming knowledge of the context distribution and the mean reward of each arm-context pair, we cast the problem as an online bipartite matching problem, where the right-vertices (contexts) arrive stochastically and the left-vertices (arms) are blocked for a finite number of rounds each time they are matched.

Blocking, Novel Concepts +1

On Under-exploration in Bandits with Mean Bounds from Confounded Data

no code implementations 19 Feb 2020 Nihal Sharma, Soumya Basu, Karthikeyan Shanmugam, Sanjay Shakkottai

We study a variant of the multi-armed bandit problem where side information in the form of bounds on the mean of each arm is provided.

The Gossiping Insert-Eliminate Algorithm for Multi-Agent Bandits

no code implementations 15 Jan 2020 Ronshee Chawla, Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Agents use the communication medium to recommend only arm-IDs (not samples), and thus update the set of arms from which they play.

Verification and Parameter Synthesis for Stochastic Systems using Optimistic Optimization

no code implementations 4 Nov 2019 Negin Musavi, Dawei Sun, Sayan Mitra, Geir Dullerud, Sanjay Shakkottai

As a consequence, we obtain theoretical regret bounds on the sample efficiency of our solution that depend on key problem parameters like smoothness, near-optimality dimension, and batch size.

Social Learning in Multi Agent Multi Armed Bandits

no code implementations 4 Oct 2019 Abishek Sankararaman, Ayalvadi Ganesh, Sanjay Shakkottai

Our setting consists of a large number of agents $n$ that collaboratively and simultaneously solve the same instance of a $K$-armed MAB to minimize the average cumulative regret over all agents.

Multi-Armed Bandits

Blocking Bandits

no code implementations NeurIPS 2019 Soumya Basu, Rajat Sen, Sujay Sanghavi, Sanjay Shakkottai

We show that with prior knowledge of the rewards and delays of all the arms, the problem of optimizing cumulative reward does not admit any pseudo-polynomial time algorithm (in the number of arms) unless the randomized exponential time hypothesis is false, by mapping to the PINWHEEL scheduling problem.

Blocking, Product Recommendation +1

Mix and Match: An Optimistic Tree-Search Approach for Learning Models from Mixture Distributions

1 code implementation NeurIPS 2020 Matthew Faw, Rajat Sen, Karthikeyan Shanmugam, Constantine Caramanis, Sanjay Shakkottai

We consider a covariate shift problem where one has access to several different training datasets for the same learning problem and a small validation set which possibly differs from all the individual training distributions.

Noisy Blackbox Optimization with Multi-Fidelity Queries: A Tree Search Approach

1 code implementation 24 Oct 2018 Rajat Sen, Kirthevasan Kandasamy, Sanjay Shakkottai

We study the problem of black-box optimization of a noisy function in the presence of low-cost approximations or fidelities, which is motivated by problems like hyper-parameter tuning.

Applications of Common Entropy for Causal Inference

no code implementations NeurIPS 2020 Murat Kocaoglu, Sanjay Shakkottai, Alexandros G. Dimakis, Constantine Caramanis, Sriram Vishwanath

We study the problem of discovering the simplest latent variable that can make two observed discrete variables conditionally independent.

Causal Inference

Multi-Fidelity Black-Box Optimization with Hierarchical Partitions

no code implementations ICML 2018 Rajat Sen, Kirthevasan Kandasamy, Sanjay Shakkottai

Motivated by settings such as hyper-parameter tuning and physical simulations, we consider the problem of black-box optimization of a function.

Physical Simulations

Importance Weighted Generative Networks

no code implementations 7 Jun 2018 Maurice Diesendruck, Ethan R. Elenberg, Rajat Sen, Guy W. Cole, Sanjay Shakkottai, Sinead A. Williamson

Deep generative networks can simulate from a complex target distribution, by minimizing a loss with respect to samples from that distribution.

Selection bias

Contextual Bandits with Stochastic Experts

1 code implementation 23 Feb 2018 Rajat Sen, Karthikeyan Shanmugam, Nihal Sharma, Sanjay Shakkottai

We consider the problem of contextual bandits with stochastic experts, which is a variation of the traditional stochastic contextual bandit with experts problem.

Multi-Armed Bandits

Identifying Best Interventions through Online Importance Sampling

no code implementations ICML 2017 Rajat Sen, Karthikeyan Shanmugam, Alexandros G. Dimakis, Sanjay Shakkottai

Motivated by applications in computational advertising and systems biology, we consider the problem of identifying the best out of several possible soft interventions at a source node $V$ in an acyclic causal directed graph, to maximize the expected value of a target node $Y$ (located downstream of $V$).

Regret of Queueing Bandits

no code implementations NeurIPS 2016 Subhashini Krishnasamy, Rajat Sen, Ramesh Johari, Sanjay Shakkottai

A naive view of this problem would suggest that queue-regret should grow logarithmically: since queue-regret cannot be larger than classical regret, results for the standard MAB problem give algorithms that ensure queue-regret increases no more than logarithmically in time.

The Search Problem in Mixture Models

no code implementations 4 Oct 2016 Avik Ray, Joe Neeman, Sujay Sanghavi, Sanjay Shakkottai

We consider the task of learning the parameters of a {\em single} component of a mixture model, for the case when we are given {\em side information} about that component; we call this the "search problem" in mixture models.

Clustering, Topic Models

Contextual Bandits with Latent Confounders: An NMF Approach

no code implementations 1 Jun 2016 Rajat Sen, Karthikeyan Shanmugam, Murat Kocaoglu, Alexandros G. Dimakis, Sanjay Shakkottai

Our algorithm achieves a regret of $\mathcal{O}\left(L\mathrm{poly}(m, \log K) \log T \right)$ at time $T$, as compared to $\mathcal{O}(LK\log T)$ for conventional contextual bandits, assuming a constant gap between the best arm and the rest for each context.

Matrix Completion, Multi-Armed Bandits

Online Collaborative-Filtering on Graphs

no code implementations 7 Nov 2014 Siddhartha Banerjee, Sujay Sanghavi, Sanjay Shakkottai

We consider this problem under a simple natural model, wherein the number of items and the number of item-views are of the same order, and an `access-graph' constrains which user is allowed to see which item.

Collaborative Filtering, Recommendation Systems

Localized epidemic detection in networks with overwhelming noise

no code implementations 6 Feb 2014 Eli A. Meirom, Chris Milling, Constantine Caramanis, Shie Mannor, Ariel Orda, Sanjay Shakkottai

Our algorithm requires only local-neighbor knowledge of this graph, and in a broad array of settings that we describe, succeeds even when false negatives and false positives make up an overwhelming fraction of the data available.
