no code implementations • 11 Feb 2025 • Divyansh Jhunjhunwala, Pranay Sharma, Zheng Xu, Gauri Joshi
Several recent works explore the benefits of pre-trained initialization in a federated learning (FL) setting, where the downstream training is performed at the edge clients with heterogeneous data distribution.
no code implementations • 27 Nov 2024 • Shuli Jiang, Qiuyi Zhang, Gauri Joshi
We study a classical problem in private prediction, the problem of computing an $(m\epsilon, \delta)$-differentially private majority of $K$ $(\epsilon, \Delta)$-differentially private algorithms for $1 \leq m \leq K$ and $1 > \delta \geq \Delta \geq 0$.
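A rough sketch of the subsampled-majority primitive this studies, assuming each entry of `mechanisms` is a callable returning a binary, individually $(\epsilon, \Delta)$-DP answer; releasing the majority of $m$ of them costs at most $(m\epsilon, m\Delta)$ by naive sequential composition (the paper's point is to do much better than this loose bound). All names here are hypothetical.

```python
import random

def private_majority(mechanisms, data, m):
    """Majority vote over a random subsample of m private mechanisms.

    Each mechanism is assumed (epsilon, Delta)-DP on its own; by naive
    sequential composition the released majority is (m*epsilon, m*Delta)-DP.
    """
    chosen = random.sample(mechanisms, m)    # subsample without replacement
    votes = [mech(data) for mech in chosen]  # each vote is 0 or 1
    return int(sum(votes) > m / 2)
```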
no code implementations • 21 Oct 2024 • Baris Askin, Pranay Sharma, Gauri Joshi, Carlee Joe-Wong
We study a federated version of multi-objective optimization (MOO), where a single model is trained to optimize multiple objective functions.
no code implementations • 17 Oct 2024 • Aleksandar Armacki, Shuhua Yu, Pranay Sharma, Gauri Joshi, Dragana Bajovic, Dusan Jakovetic, Soummya Kar
While the rate exponents in state-of-the-art results depend on noise moments and vanish as $p \rightarrow 1$, our exponents remain constant and are strictly better whenever $p < 6/5$ for non-convex and $p < 8/7$ for strongly convex costs.
no code implementations • 13 Oct 2024 • Aayushya Agarwal, Gauri Joshi, Larry Pileggi
Federated learning harnesses the power of distributed optimization to train a unified machine learning model across separate clients.
no code implementations • 2 Oct 2024 • Zhenyu Sun, Ziyang Zhang, Zheng Xu, Gauri Joshi, Pranay Sharma, Ermin Wei
In cross-device federated learning (FL) with millions of mobile clients, only a small subset of clients participate in training in every communication round, and Federated Averaging (FedAvg) is the most popular algorithm in practice.
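A minimal FedAvg sketch with partial client participation; the `clients[i].grad(w)` oracle is a hypothetical stand-in for a stochastic gradient on client $i$'s local data.

```python
import numpy as np

def fedavg(global_w, clients, rounds, frac, local_steps, lr, seed=0):
    """Minimal FedAvg: each round, sample a small client subset, run
    local SGD from the global model, and average the resulting models."""
    rng = np.random.default_rng(seed)
    n_sample = max(1, int(frac * len(clients)))
    for _ in range(rounds):
        subset = rng.choice(len(clients), size=n_sample, replace=False)
        local_models = []
        for i in subset:
            w = global_w.copy()
            for _ in range(local_steps):
                w -= lr * clients[i].grad(w)   # local SGD step
            local_models.append(w)
        global_w = np.mean(local_models, axis=0)  # server-side averaging
    return global_w
```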
no code implementations • 2 Sep 2024 • Divyansh Jhunjhunwala, Neharika Jali, Gauri Joshi, Shiqiang Wang
Erasure-coded computing has been successfully used in cloud systems to reduce tail latency caused by factors such as straggling servers and heterogeneous traffic variations.
no code implementations • 1 Jun 2024 • Baris Askin, Pranay Sharma, Carlee Joe-Wong, Gauri Joshi
Much of the existing work in FL focuses on efficiently learning a model for a single task.
1 code implementation • 19 Mar 2024 • Divyansh Jhunjhunwala, Shiqiang Wang, Gauri Joshi
Standard federated learning (FL) algorithms typically require multiple rounds of communication between the server and the clients, which has several drawbacks, including requiring constant network connectivity, repeated investment of computational resources, and susceptibility to privacy attacks.
no code implementations • 8 Feb 2024 • Jiin Woo, Laixi Shi, Gauri Joshi, Yuejie Chi
Our sample complexity analysis reveals that, with appropriately chosen parameters and synchronization schedules, FedLCB-Q achieves linear speedup in terms of the number of agents without requiring high-quality datasets at individual agents, as long as the local datasets collectively cover the state-action space visited by the optimal policy, highlighting the power of collaboration in the federated setting.
no code implementations • 2 Feb 2024 • Neharika Jali, Guannan Qu, Weina Wang, Gauri Joshi
Unlike in homogeneous systems, a threshold policy, which routes jobs to the slow server(s) only when the queue length exceeds a certain threshold, is known to be optimal for the one-fast-one-slow two-server system.
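The threshold policy itself is one line; a hedged sketch for the one-fast-one-slow case (in practice the threshold would be tuned to the two service rates):

```python
def route(queue_length, threshold):
    """Threshold routing: keep jobs at the fast server while the queue is
    short; use the slow server only once the queue length exceeds the
    threshold."""
    return "fast" if queue_length <= threshold else "slow"
```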
no code implementations • 12 Jan 2024 • Yae Jee Cho, Luyang Liu, Zheng Xu, Aldi Fahrezi, Gauri Joshi
Foundation models (FMs) adapt well to specific domains or tasks with fine-tuning, and federated learning (FL) offers the potential for privacy-preserving fine-tuning of FMs with on-device local data.
no code implementations • 28 Oct 2023 • Aleksandar Armacki, Pranay Sharma, Gauri Joshi, Dragana Bajovic, Dusan Jakovetic, Soummya Kar
First, for non-convex costs and component-wise nonlinearities, we establish a convergence rate arbitrarily close to $\mathcal{O}\left(t^{-\frac{1}{4}}\right)$, whose exponent is independent of noise and problem parameters.
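A sketch of the nonlinear SGD template analyzed here, with component-wise clipping as the nonlinearity (sign(.) is another common choice); `grad_fn` is a hypothetical stochastic-gradient oracle.

```python
import numpy as np

def nonlinear_sgd(grad_fn, w, steps, lr=0.1, tau=1.0):
    """SGD where a component-wise nonlinearity is applied to each
    stochastic gradient before the step, taming heavy-tailed noise."""
    for t in range(steps):
        g = np.clip(grad_fn(w), -tau, tau)   # component-wise nonlinearity
        w = w - (lr / np.sqrt(t + 1)) * g    # decaying step size
    return w
```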
no code implementations • ICCV 2023 • Yae Jee Cho, Gauri Joshi, Dimitrios Dimitriadis
For both cross-device and cross-silo settings, we show that FedLabel outperforms other semi-supervised FL baselines by $8$-$24\%$, and even outperforms standard fully supervised FL baselines ($100\%$ labeled data) with only $5$-$20\%$ of labeled data.
no code implementations • 18 May 2023 • Jiin Woo, Gauri Joshi, Yuejie Chi
When the data used for reinforcement learning (RL) are collected by multiple agents in a distributed manner, federated versions of RL algorithms allow collaborative learning without the need for agents to share their local data.
no code implementations • 8 Feb 2023 • Pranay Sharma, Rohan Panda, Gauri Joshi
We analyze the convergence of the proposed algorithm for classes of nonconvex-concave and nonconvex-nonconcave functions and characterize the impact of heterogeneous client data, partial client participation, and heterogeneous local computations.
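A compact sketch of the local stochastic gradient descent-ascent template such analyses cover; `c.grads(x, y)` is a hypothetical oracle returning a client's stochastic gradients for the min variable $x$ and max variable $y$.

```python
import numpy as np

def local_sgda_round(x, y, clients, local_steps, lr_x, lr_y):
    """One round: each client runs simultaneous descent (x) / ascent (y)
    steps on its own minimax objective; the server averages both."""
    xs, ys = [], []
    for c in clients:
        xi, yi = x.copy(), y.copy()
        for _ in range(local_steps):
            gx, gy = c.grads(xi, yi)
            xi, yi = xi - lr_x * gx, yi + lr_y * gy
        xs.append(xi)
        ys.append(yi)
    return np.mean(xs, axis=0), np.mean(ys, axis=0)
```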
no code implementations • 6 Feb 2023 • Yae Jee Cho, Pranay Sharma, Gauri Joshi, Zheng Xu, Satyen Kale, Tong Zhang
Federated Averaging (FedAvg) and its variants are the most popular optimization algorithms in federated learning (FL).
2 code implementations • 23 Jan 2023 • Divyansh Jhunjhunwala, Shiqiang Wang, Gauri Joshi
Federated Averaging (FedAvg) remains the most popular algorithm for Federated Learning (FL) optimization due to its simple implementation, stateless nature, and privacy guarantees combined with secure aggregation.
1 code implementation • 28 Jul 2022 • Divyansh Jhunjhunwala, Pranay Sharma, Aushim Nagarkatti, Gauri Joshi
To remedy this, we propose FedVARP, a novel variance reduction algorithm applied at the server that eliminates error due to partial client participation.
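A condensed sketch of the server-side variance-reduction idea, assuming `deltas[i]` is the model change reported by a participating client and `state[i]` caches the last update seen from each client (zeros initially); signs follow the convention that deltas are descent directions. Interface names are hypothetical.

```python
import numpy as np

def varp_server_round(global_w, deltas, participants, state, num_clients,
                      server_lr=1.0):
    """FedVARP-style round: correct the partial-participation average
    with cached updates from all clients, then refresh the cache."""
    correction = np.mean([deltas[i] - state[i] for i in participants], axis=0)
    memory = sum(state) / num_clients        # average of cached updates
    v = correction + memory                  # variance-reduced direction
    for i in participants:
        state[i] = deltas[i]                 # refresh cache for participants
    return global_w - server_lr * v
```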
no code implementations • 9 Jul 2022 • Neelkamal Bhuyan, Sharayu Moharir, Gauri Joshi
Federated Learning (FL) is a variant of distributed learning where edge devices collaborate to learn a model without sharing their data with the central server or each other.
no code implementations • 21 Jun 2022 • Sajad Khodadadian, Pranay Sharma, Gauri Joshi, Siva Theja Maguluri
Federated reinforcement learning is a framework in which $N$ agents collaboratively learn a global model, without sharing their individual data and policies.
no code implementations • 9 Jun 2022 • Jianyu Wang, Rudrajit Das, Gauri Joshi, Satyen Kale, Zheng Xu, Tong Zhang
Motivated by this observation, we propose a new quantity, average drift at optimum, to measure the effects of data heterogeneity, and explicitly use it to present a new theoretical analysis of FedAvg.
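To make the notion concrete, one simple proxy for drift at the optimum (a hedged illustration, not necessarily the paper's exact definition) is the spread of client gradients at the global optimum $x^*$ of $F = \frac{1}{M}\sum_{m=1}^{M} F_m$:

\[
\zeta_*^2 \;=\; \frac{1}{M}\sum_{m=1}^{M} \bigl\|\nabla F_m(x^*)\bigr\|^2 ,
\]

which vanishes exactly when every client is individually stationary at $x^*$, so that local updates cause no drift; note that $\nabla F(x^*) = 0$ alone does not imply this.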
1 code implementation • 1 Jun 2022 • Ellango Jothimurugesan, Kevin Hsieh, Jianyu Wang, Gauri Joshi, Phillip B. Gibbons
Federated Learning (FL) under distributed concept drift is a largely unexplored area.
no code implementations • 30 May 2022 • Yae Jee Cho, Divyansh Jhunjhunwala, Tian Li, Virginia Smith, Gauri Joshi
We provide convergence guarantees for MaxFL and show that MaxFL achieves a $22$-$40\%$ and $18$-$50\%$ test accuracy improvement for the training clients and unseen clients respectively, compared to a wide range of FL modeling approaches, including those that tackle data heterogeneity, aim to incentivize clients, and learn personalized or fair models.
no code implementations • 27 Apr 2022 • Yae Jee Cho, Andre Manoel, Gauri Joshi, Robert Sim, Dimitrios Dimitriadis
In this work, we propose a novel ensemble knowledge transfer method named Fed-ET, in which small models (of differing architectures) are trained on clients and used to train a larger model at the server.
no code implementations • 9 Mar 2022 • Pranay Sharma, Rohan Panda, Gauri Joshi, Pramod K. Varshney
In this paper, we consider nonconvex minimax optimization, which is gaining prominence in many modern machine learning applications such as GANs.
no code implementations • 28 Jan 2022 • Jianyu Wang, Hang Qi, Ankit Singh Rawat, Sashank Reddi, Sagar Waghmare, Felix X. Yu, Gauri Joshi
In classical federated learning, the clients contribute to the overall training by communicating local updates for the underlying model on their private data to a coordinating server.
no code implementations • NeurIPS 2021 • Divyansh Jhunjhunwala, Ankur Mallick, Advait Gadhikar, Swanand Kadhe, Gauri Joshi
We study the problem of estimating at a central server the mean of a set of vectors distributed across several nodes (one vector per node).
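The plain baseline this line of work improves on is unbiased sparsified mean estimation; a minimal rand-$k$ sketch (the paper's contribution, exploiting correlations across nodes and rounds, is not shown here):

```python
import numpy as np

def rand_k(v, k, rng):
    """Unbiased rand-k sparsification: keep k random coordinates and
    rescale by d/k so that E[compressed vector] = v."""
    d = v.size
    idx = rng.choice(d, size=k, replace=False)
    out = np.zeros(d)
    out[idx] = v[idx] * (d / k)
    return out

def estimate_mean(vectors, k, seed=0):
    """Server mean estimate from sparsified node vectors."""
    rng = np.random.default_rng(seed)
    return np.mean([rand_k(v, k, rng) for v in vectors], axis=0)
```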
no code implementations • 16 Sep 2021 • Yae Jee Cho, Jianyu Wang, Tarun Chiruvolu, Gauri Joshi
Personalized federated learning (FL) aims to train model(s) that can perform well for individual clients with highly heterogeneous data and system capabilities.
no code implementations • 10 Sep 2021 • Samarth Gupta, Gauri Joshi, Osman Yağan
In this paper, we consider the problem of best-arm identification in multi-armed bandits in the fixed-confidence setting, where the goal is to identify, with probability $1-\delta$ for some $\delta>0$, the arm with the highest mean reward using the minimum possible number of samples from the set of arms $\mathcal{K}$.
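For orientation, a textbook successive-elimination baseline for this fixed-confidence setting (a generic sketch, not the paper's algorithm); `pull(a)` is a hypothetical oracle returning a reward in $[0, 1]$ for arm `a`.

```python
import numpy as np

def successive_elimination(pull, K, delta):
    """Return the arm identified as best with probability >= 1 - delta."""
    active = list(range(K))
    sums, t = np.zeros(K), 0
    while len(active) > 1:
        t += 1
        for a in active:
            sums[a] += pull(a)               # one sample per active arm
        means = sums[active] / t
        rad = np.sqrt(np.log(4 * K * t * t / delta) / (2 * t))
        keep = means + rad >= means.max() - rad  # UCB overlaps best LCB
        active = [a for a, kp in zip(active, keep) if kp]
    return active[0]
```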
2 code implementations • 14 Jul 2021 • Jianyu Wang, Zachary Charles, Zheng Xu, Gauri Joshi, H. Brendan McMahan, Blaise Aguera y Arcas, Maruan Al-Shedivat, Galen Andrew, Salman Avestimehr, Katharine Daly, Deepesh Data, Suhas Diggavi, Hubert Eichner, Advait Gadhikar, Zachary Garrett, Antonious M. Girgis, Filip Hanzely, Andrew Hard, Chaoyang He, Samuel Horvath, Zhouyuan Huo, Alex Ingerman, Martin Jaggi, Tara Javidi, Peter Kairouz, Satyen Kale, Sai Praneeth Karimireddy, Jakub Konecny, Sanmi Koyejo, Tian Li, Luyang Liu, Mehryar Mohri, Hang Qi, Sashank J. Reddi, Peter Richtarik, Karan Singhal, Virginia Smith, Mahdi Soltanolkotabi, Weikang Song, Ananda Theertha Suresh, Sebastian U. Stich, Ameet Talwalkar, Hongyi Wang, Blake Woodworth, Shanshan Wu, Felix X. Yu, Honglin Yuan, Manzil Zaheer, Mi Zhang, Tong Zhang, Chunxiang Zheng, Chen Zhu, Wennan Zhu
Federated learning and analytics are distributed approaches for collaboratively learning models (or statistics) from decentralized data, motivated by and designed for privacy protection.
no code implementations • 8 Jun 2021 • Tuhinangshu Choudhury, Gauri Joshi, Weina Wang, Sanjay Shakkottai
In multi-server queueing systems where there is no central queue holding all incoming jobs, job dispatching policies are used to assign incoming jobs to the queue at one of the servers.
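A classic baseline dispatching policy for such systems (shown for illustration only, not the paper's proposal) is power-of-$d$ choices:

```python
import random

def power_of_d(queue_lengths, d):
    """Probe d random servers and dispatch the job to the shortest of
    the probed queues."""
    probed = random.sample(range(len(queue_lengths)), d)
    return min(probed, key=lambda s: queue_lengths[s])
```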
no code implementations • 4 Jun 2021 • Jianyu Wang, Zheng Xu, Zachary Garrett, Zachary Charles, Luyang Liu, Gauri Joshi
Popular optimization algorithms of FL use vanilla (stochastic) gradient descent for both local updates at clients and global updates at the aggregating server.
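One common way to go beyond vanilla gradient descent on the server side (a FedAdam-style sketch in the FedOpt mold, not necessarily this paper's method) treats the averaged client update as a pseudo-gradient consumed by an adaptive server optimizer:

```python
import numpy as np

def server_adam_step(w, pseudo_grad, m, v, lr=0.1, b1=0.9, b2=0.99, eps=1e-3):
    """One adaptive server update; pseudo_grad is the average of the
    client deltas, treated as a gradient of the global objective."""
    m = b1 * m + (1 - b1) * pseudo_grad
    v = b2 * v + (1 - b2) * pseudo_grad ** 2
    return w - lr * m / (np.sqrt(v) + eps), m, v
```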
no code implementations • 8 Feb 2021 • Divyansh Jhunjhunwala, Advait Gadhikar, Gauri Joshi, Yonina C. Eldar
Communication of model updates between client nodes and the central aggregating server is a major bottleneck in federated learning, especially in bandwidth-limited settings and high-dimensional models.
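A standard building block here is unbiased stochastic uniform quantization of the update vector; adaptive schemes then tune the number of levels per round. A hedged sketch of the fixed-level primitive:

```python
import numpy as np

def stochastic_quantize(v, levels, rng):
    """Quantize each coordinate to one of `levels` uniform values in
    [v.min(), v.max()], rounding up with probability equal to the
    fractional part so that E[output] = v (unbiased)."""
    lo, hi = v.min(), v.max()
    scale = (hi - lo) / (levels - 1) + 1e-12
    pos = (v - lo) / scale
    low = np.floor(pos)
    q = low + (rng.random(v.shape) < (pos - low))
    return lo + q * scale
```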
no code implementations • 14 Dec 2020 • Yae Jee Cho, Samarth Gupta, Gauri Joshi, Osman Yağan
Due to communication constraints and intermittent client availability in federated learning, only a subset of clients can participate in each training round.
no code implementations • 3 Oct 2020 • Yae Jee Cho, Jianyu Wang, Gauri Joshi
Federated learning is a distributed optimization paradigm that enables a large number of resource-limited client nodes to cooperatively train a model without data sharing.
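Biased client selection in the spirit of this work's power-of-choice strategy can be sketched as: probe a candidate set, then keep the clients with the largest local loss. The `clients[i].loss(w)` oracle is hypothetical.

```python
import numpy as np

def power_of_choice(clients, global_w, d, k, rng):
    """Select k of d probed candidates with the highest local loss,
    biasing participation toward under-fit clients."""
    candidates = rng.choice(len(clients), size=d, replace=False)
    scored = [(clients[i].loss(global_w), i) for i in candidates]
    return [i for _, i in sorted(scored, reverse=True)[:k]]
```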
1 code implementation • 18 Jul 2020 • Ankur Mallick, Chaitanya Dwivedi, Bhavya Kailkhura, Gauri Joshi, T. Yong-Jin Han
In this work, we show that the uncertainty estimation capability of state-of-the-art BNNs and Deep Ensemble models degrades significantly when the amount of training data is small.
1 code implementation • NeurIPS 2020 • Jianyu Wang, Qinghua Liu, Hao Liang, Gauri Joshi, H. Vincent Poor
In federated optimization, heterogeneity in the clients' local datasets and computation speeds results in large variations in the number of local updates performed by each client in each communication round.
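A simplified rendering of the normalized-averaging fix (FedNova-style, assuming vanilla-SGD clients with the learning rate absorbed into the deltas): divide each client's change by its own number of local steps before combining, so clients that ran more steps do not drag the global model toward their local optimum.

```python
def fednova_aggregate(global_w, deltas, steps, weights):
    """deltas[i]: global_w - w_i_end; steps[i]: local steps taken;
    weights[i]: data fraction p_i, summing to 1."""
    normalized = [d / s for d, s in zip(deltas, steps)]
    tau_eff = sum(w * s for w, s in zip(weights, steps))   # effective steps
    update = tau_eff * sum(w * n for w, n in zip(weights, normalized))
    return global_w - update
```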
no code implementations • 23 Mar 2020 • Sanghamitra Dutta, Jianyu Wang, Gauri Joshi
Distributed Stochastic Gradient Descent (SGD), when run in a synchronous manner, suffers from delays in runtime as it waits for the slowest workers (stragglers).
no code implementations • 12 Mar 2020 • Xiaoxi Zhang, Jianyu Wang, Gauri Joshi, Carlee Joe-Wong
Due to the massive size of the neural network models and training datasets used in machine learning today, it is imperative to distribute stochastic gradient descent (SGD) by splitting up tasks such as gradient evaluation across multiple worker nodes.
1 code implementation • 21 Feb 2020 • Jianyu Wang, Hao Liang, Gauri Joshi
In this paper, we propose an algorithmic approach named Overlap-Local-SGD (and its momentum variant) to overlap the communication and computation so as to speedup the distributed training procedure.
9 code implementations • 10 Dec 2019 • Peter Kairouz, H. Brendan McMahan, Brendan Avent, Aurélien Bellet, Mehdi Bennis, Arjun Nitin Bhagoji, Kallista Bonawitz, Zachary Charles, Graham Cormode, Rachel Cummings, Rafael G. L. D'Oliveira, Hubert Eichner, Salim El Rouayheb, David Evans, Josh Gardner, Zachary Garrett, Adrià Gascón, Badih Ghazi, Phillip B. Gibbons, Marco Gruteser, Zaid Harchaoui, Chaoyang He, Lie He, Zhouyuan Huo, Ben Hutchinson, Justin Hsu, Martin Jaggi, Tara Javidi, Gauri Joshi, Mikhail Khodak, Jakub Konečný, Aleksandra Korolova, Farinaz Koushanfar, Sanmi Koyejo, Tancrède Lepoint, Yang Liu, Prateek Mittal, Mehryar Mohri, Richard Nock, Ayfer Özgür, Rasmus Pagh, Mariana Raykova, Hang Qi, Daniel Ramage, Ramesh Raskar, Dawn Song, Weikang Song, Sebastian U. Stich, Ziteng Sun, Ananda Theertha Suresh, Florian Tramèr, Praneeth Vepakomma, Jianyu Wang, Li Xiong, Zheng Xu, Qiang Yang, Felix X. Yu, Han Yu, Sen Zhao
FL embodies the principles of focused data collection and minimization, and can mitigate many of the systemic privacy risks and costs resulting from traditional, centralized machine learning and data science approaches.
2 code implementations • 6 Nov 2019 • Samarth Gupta, Shreyas Chaudhari, Gauri Joshi, Osman Yağan
We consider a multi-armed bandit framework where the rewards obtained by pulling different arms are correlated.
1 code implementation • 13 Oct 2019 • Ankur Mallick, Chaitanya Dwivedi, Bhavya Kailkhura, Gauri Joshi, T. Yong-Jin Han
Experiments on a variety of datasets show that our approach outperforms the state-of-the-art in GP kernel learning in both supervised and semi-supervised settings.
2 code implementations • 2 Oct 2019 • Angela H. Jiang, Daniel L. -K. Wong, Giulio Zhou, David G. Andersen, Jeffrey Dean, Gregory R. Ganger, Gauri Joshi, Michael Kaminsky, Michael Kozuch, Zachary C. Lipton, Padmanabhan Pillai
This paper introduces Selective-Backprop, a technique that accelerates the training of deep neural networks (DNNs) by prioritizing examples with high loss at each iteration.
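A hedged PyTorch sketch of the idea; the paper selects examples probabilistically from the loss distribution, while this simpler variant just backpropagates the top-$k$ losses.

```python
import torch

def selective_backprop_step(model, loss_fn, optimizer, x, y, keep_frac=0.25):
    """Forward the full batch cheaply, then backpropagate only through
    the highest-loss examples. loss_fn must use reduction='none'."""
    with torch.no_grad():                       # cheap selection pass
        per_example = loss_fn(model(x), y)
    k = max(1, int(keep_frac * x.shape[0]))
    idx = per_example.topk(k).indices           # highest-loss examples
    optimizer.zero_grad()
    loss_fn(model(x[idx]), y[idx]).mean().backward()
    optimizer.step()
```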
4 code implementations • 23 May 2019 • Jianyu Wang, Anit Kumar Sahu, Zhouyi Yang, Gauri Joshi, Soummya Kar
This paper studies the problem of error-runtime trade-off, typically encountered in decentralized training based on stochastic gradient descent (SGD) using a given network.
no code implementations • 29 Mar 2019 • Alexander Ratner, Dan Alistarh, Gustavo Alonso, David G. Andersen, Peter Bailis, Sarah Bird, Nicholas Carlini, Bryan Catanzaro, Jennifer Chayes, Eric Chung, Bill Dally, Jeff Dean, Inderjit S. Dhillon, Alexandros Dimakis, Pradeep Dubey, Charles Elkan, Grigori Fursin, Gregory R. Ganger, Lise Getoor, Phillip B. Gibbons, Garth A. Gibson, Joseph E. Gonzalez, Justin Gottschlich, Song Han, Kim Hazelwood, Furong Huang, Martin Jaggi, Kevin Jamieson, Michael. I. Jordan, Gauri Joshi, Rania Khalaf, Jason Knight, Jakub Konečný, Tim Kraska, Arun Kumar, Anastasios Kyrillidis, Aparna Lakshmiratan, Jing Li, Samuel Madden, H. Brendan McMahan, Erik Meijer, Ioannis Mitliagkas, Rajat Monga, Derek Murray, Kunle Olukotun, Dimitris Papailiopoulos, Gennady Pekhimenko, Theodoros Rekatsinas, Afshin Rostamizadeh, Christopher Ré, Christopher De Sa, Hanie Sedghi, Siddhartha Sen, Virginia Smith, Alex Smola, Dawn Song, Evan Sparks, Ion Stoica, Vivienne Sze, Madeleine Udell, Joaquin Vanschoren, Shivaram Venkataraman, Rashmi Vinayak, Markus Weimer, Andrew Gordon Wilson, Eric Xing, Matei Zaharia, Ce Zhang, Ameet Talwalkar
Machine learning (ML) techniques are enjoying rapidly increasing adoption.
no code implementations • 19 Oct 2018 • Jianyu Wang, Gauri Joshi
Large-scale machine learning training, in particular distributed stochastic gradient descent, needs to be robust to inherent system variability such as node straggling and random communication delays.
no code implementations • 18 Oct 2018 • Samarth Gupta, Shreyas Chaudhari, Subhojyoti Mukherjee, Gauri Joshi, Osman Yağan
We consider a finite-armed structured bandit problem in which mean rewards of different arms are known functions of a common hidden parameter $\theta^*$.
no code implementations • 22 Aug 2018 • Jianyu Wang, Gauri Joshi
Communication-efficient SGD algorithms, which allow nodes to perform local updates and periodically synchronize local models, are highly effective in improving the speed and scalability of distributed SGD.
2 code implementations • 17 Aug 2018 • Samarth Gupta, Gauri Joshi, Osman Yağan
As a result, there are regimes where our algorithm achieves a $\mathcal{O}(1)$ regret as opposed to the typical logarithmic regret scaling of multi-armed bandit algorithms.
no code implementations • 16 Aug 2018 • Samarth Gupta, Gauri Joshi, Osman Yağan
At each time step, we choose one of the $K$ possible functions $g_1, \ldots, g_K$ and observe the corresponding sample $g_i(X)$.
no code implementations • 3 Mar 2018 • Sanghamitra Dutta, Gauri Joshi, Soumyadip Ghosh, Parijat Dube, Priya Nagpurkar
Distributed Stochastic Gradient Descent (SGD), when run in a synchronous manner, suffers from delays in waiting for the slowest learners (stragglers).
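The error-runtime trade-off studied here can be illustrated with a K-sync SGD step (a standard variant from this line of work): the update uses only the gradients of the fastest $K$ workers and discards the stragglers' work that iteration.

```python
import numpy as np

def k_sync_step(w, worker_grads, worker_delays, k, lr):
    """Average the k gradients that arrived first and step with them."""
    fastest = np.argsort(worker_delays)[:k]
    g = np.mean([worker_grads[i] for i in fastest], axis=0)
    return w - lr * g
```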