Search Results for author: Xidong Wu

Found 11 papers, 3 papers with code

Auto-Train-Once: Controller Network Guided Automatic Network Pruning from Scratch

1 code implementation • 21 Mar 2024 • Xidong Wu, Shangqian Gao, Zeyu Zhang, Zhenzhen Li, Runxue Bao, Yanfu Zhang, Xiaoqian Wang, Heng Huang

Current techniques for deep neural network (DNN) pruning often involve intricate multi-step processes that require domain-specific expertise, making their widespread adoption challenging.

Network Pruning
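
The snippet above only motivates the problem. As a point of reference, the sketch below shows plain one-shot magnitude pruning in NumPy; the function name and threshold logic are illustrative assumptions, not the controller-network-guided Auto-Train-Once method itself.

```python
import numpy as np

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude entries of a weight matrix.

    Generic one-shot magnitude pruning for illustration only; the paper
    above instead learns pruning decisions with a controller network
    while training from scratch.
    """
    flat = np.abs(weights).ravel()
    k = int(sparsity * flat.size)              # number of weights to drop
    if k == 0:
        return weights.copy()
    threshold = np.partition(flat, k - 1)[k - 1]
    mask = np.abs(weights) > threshold         # keep only the larger weights
    return weights * mask

layer = np.random.randn(64, 64)                # toy "layer"
pruned = prune_by_magnitude(layer, sparsity=0.7)
print("remaining nonzeros:", np.count_nonzero(pruned), "of", layer.size)
```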

On the Role of Server Momentum in Federated Learning

no code implementations • 19 Dec 2023 • Jianhui Sun, Xidong Wu, Heng Huang, Aidong Zhang

To the best of our knowledge, this is the first work that thoroughly analyzes the performance of server momentum under a hyperparameter scheduler and system heterogeneity.

Federated Learning
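
For context on the setup analyzed above, here is a minimal NumPy sketch of server-side momentum layered on FedAvg-style aggregation. The step-decay scheduler and the fabricated client updates are assumptions for illustration, not the precise algorithm and scheduler studied in the paper.

```python
import numpy as np

def server_momentum_step(global_w, client_ws, velocity, lr, beta=0.9):
    """One round: average the client models, treat the change as a
    pseudo-gradient, and apply heavy-ball momentum on the server."""
    avg_w = np.mean(client_ws, axis=0)
    pseudo_grad = global_w - avg_w             # direction implied by clients
    velocity = beta * velocity + pseudo_grad   # server momentum buffer
    new_global_w = global_w - lr * velocity
    return new_global_w, velocity

w, v = np.zeros(5), np.zeros(5)
for rnd in range(10):
    lr = 1.0 * (0.5 ** (rnd // 3))             # step-decay scheduler (assumed)
    client_ws = np.stack([w - 0.1 * (w - np.ones(5)) for _ in range(4)])
    w, v = server_momentum_step(w, client_ws, v, lr)
print("global model after 10 rounds:", np.round(w, 3))
```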

Leveraging Foundation Models to Improve Lightweight Clients in Federated Learning

no code implementations • 14 Nov 2023 • Xidong Wu, Wan-Yi Lin, Devin Willmott, Filipe Condessa, Yufei Huang, Zhenzhen Li, Madan Ravi Ganesh

Federated Learning (FL) is a distributed training paradigm that enables clients scattered across the world to cooperatively learn a global model without divulging confidential data.

Federated Learning
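
As background for the FL setting described above, the following is a minimal, framework-free FedAvg sketch: clients train locally on private data and only model parameters are shared. The quadratic local objective and size-weighted averaging are illustrative assumptions; the paper's foundation-model-based initialization of lightweight clients is not reproduced here.

```python
import numpy as np

def local_sgd(w, data, lr=0.1, steps=5):
    """Each client fits the local loss 0.5*||w - mean(data)||^2 with a few
    gradient steps; raw data never leaves the client."""
    target = data.mean(axis=0)
    for _ in range(steps):
        w = w - lr * (w - target)
    return w

def fedavg_round(global_w, client_datasets):
    """One FedAvg round: broadcast, local training, size-weighted averaging."""
    sizes = np.array([len(d) for d in client_datasets], dtype=float)
    local_models = [local_sgd(global_w.copy(), d) for d in client_datasets]
    weights = sizes / sizes.sum()
    return sum(wt * m for wt, m in zip(weights, local_models))

rng = np.random.default_rng(0)
clients = [rng.normal(loc=c, size=(20 + 10 * c, 3)) for c in range(3)]
w = np.zeros(3)
for _ in range(5):
    w = fedavg_round(w, clients)
print("global model:", np.round(w, 2))
```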

Solving a Class of Non-Convex Minimax Optimization in Federated Learning

1 code implementation • NeurIPS 2023 • Xidong Wu, Jianhui Sun, Zhengmian Hu, Aidong Zhang, Heng Huang

We propose FL algorithms (FedSGDA+ and FedSGDA-M) that reduce the best-known complexity bounds for the most common minimax problems.

Federated Learning
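
The sketch below shows the underlying template, local stochastic gradient descent ascent with periodic server averaging, on a toy strongly-convex-strongly-concave client objective. FedSGDA+ and FedSGDA-M build on this template with additional ingredients that are not reproduced here; the objective and step sizes are assumptions for illustration.

```python
import numpy as np

def local_sgda(x, y, a_i, b_i, lr=0.05, steps=10):
    """Local gradient descent ascent on the client saddle objective
    f_i(x, y) = 0.5*||x - a_i||^2 + x.y - 0.5*||y - b_i||^2."""
    for _ in range(steps):
        gx = (x - a_i) + y            # gradient for the min player
        gy = x - (y - b_i)            # gradient for the max player
        x, y = x - lr * gx, y + lr * gy
    return x, y

rng = np.random.default_rng(1)
A = rng.normal(size=(4, 3))           # per-client a_i
B = rng.normal(size=(4, 3))           # per-client b_i
x, y = np.zeros(3), np.zeros(3)
for rnd in range(30):                 # communication rounds
    results = [local_sgda(x.copy(), y.copy(), A[i], B[i]) for i in range(4)]
    x = np.mean([r[0] for r in results], axis=0)   # server averages both blocks
    y = np.mean([r[1] for r in results], axis=0)
print("approx. saddle point:", np.round(x, 2), np.round(y, 2))
```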

Serverless Federated AUPRC Optimization for Multi-Party Collaborative Imbalanced Data Mining

1 code implementation • 6 Aug 2023 • Xidong Wu, Zhengmian Hu, Jian Pei, Heng Huang

To address the above challenge, we study the serverless multi-party collaborative AUPRC maximization problem, since serverless multi-party collaborative training avoids the server-node bottleneck and thereby cuts communication costs. We reformulate the problem as conditional stochastic optimization in this setting and propose a new ServerLess biAsed sTochastic gradiEnt (SLATE) algorithm to directly optimize the AUPRC.

Federated Learning, Stochastic Optimization
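
To illustrate the serverless (decentralized) pattern referenced above, the sketch below runs decentralized gradient steps with gossip averaging over a ring of nodes, so no central server is involved. The quadratic per-node loss and the ring mixing matrix are illustrative assumptions; SLATE's conditional stochastic reformulation of the AUPRC is not reproduced here.

```python
import numpy as np

n_nodes, dim, lr = 5, 3, 0.2
rng = np.random.default_rng(2)
targets = rng.normal(size=(n_nodes, dim))     # each node's local data summary
models = np.zeros((n_nodes, dim))             # one model copy per node

# Doubly stochastic ring mixing matrix: 1/2 self-weight, 1/4 per neighbour.
W = np.zeros((n_nodes, n_nodes))
for i in range(n_nodes):
    W[i, i] = 0.5
    W[i, (i - 1) % n_nodes] = 0.25
    W[i, (i + 1) % n_nodes] = 0.25

for step in range(50):
    grads = models - targets                  # gradient of 0.5*||m_i - t_i||^2
    models = W @ (models - lr * grads)        # local step + neighbour averaging
print("consensus model:", np.round(models.mean(axis=0), 3))
print("max node disagreement:", np.round(np.abs(models - models.mean(axis=0)).max(), 4))
```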

Performance and Energy Consumption of Parallel Machine Learning Algorithms

no code implementations • 1 May 2023 • Xidong Wu, Preston Brazzle, Stephen Cahoon

Parallelizing training algorithms is a common strategy for speeding up model training.
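
As a concrete illustration of that strategy, the sketch below simulates synchronous data-parallel training: the batch is split into shards, per-shard gradients are computed, and the averaged gradient is applied, mimicking an all-reduce. The least-squares objective and the in-process "workers" are simplifying assumptions; real systems distribute the shards across processes or GPUs.

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.normal(size=(1024, 10))
true_w = rng.normal(size=10)
y = X @ true_w + 0.01 * rng.normal(size=1024)

def shard_gradient(w, X_shard, y_shard):
    """Least-squares gradient computed on one worker's shard."""
    residual = X_shard @ w - y_shard
    return X_shard.T @ residual / len(y_shard)

w = np.zeros(10)
n_workers, lr = 4, 0.1
for epoch in range(200):
    shards = zip(np.array_split(X, n_workers), np.array_split(y, n_workers))
    grads = [shard_gradient(w, Xs, ys) for Xs, ys in shards]
    w -= lr * np.mean(grads, axis=0)          # "all-reduce": average the gradients
print("parameter error:", np.round(np.linalg.norm(w - true_w), 4))
```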

Decentralized Riemannian Algorithm for Nonconvex Minimax Problems

no code implementations • 8 Feb 2023 • Xidong Wu, Zhengmian Hu, Heng Huang

Minimax optimization over Riemannian manifolds (i.e., under possibly nonconvex constraints) has been actively applied to many problems, such as robust dimensionality reduction and deep neural networks with orthogonal weights (the Stiefel manifold).

Dimensionality Reduction
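
The descent half of such a scheme relies on Riemannian gradient steps on the Stiefel manifold: project the Euclidean gradient onto the tangent space, move, then retract back with a QR factorization. The sketch below shows that building block for a simple trace objective; the paired ascent variable and the decentralized aspects of the paper are not reproduced, and the objective is an illustrative assumption.

```python
import numpy as np

def stiefel_step(X, eucl_grad, lr):
    """One Riemannian gradient step on the Stiefel manifold {X : X^T X = I}:
    tangent-space projection of the Euclidean gradient, then QR retraction."""
    sym = 0.5 * (X.T @ eucl_grad + eucl_grad.T @ X)
    riem_grad = eucl_grad - X @ sym            # project onto the tangent space
    Q, R = np.linalg.qr(X - lr * riem_grad)    # retract back onto the manifold
    return Q * np.sign(np.diag(R))             # fix column signs

rng = np.random.default_rng(4)
n, p = 8, 2
M = rng.normal(size=(n, n))
A = M @ M.T / n                                # symmetric PSD matrix
X = np.linalg.qr(rng.normal(size=(n, p)))[0]   # random starting point on St(n, p)
for _ in range(300):
    X = stiefel_step(X, -2.0 * A @ X, lr=0.1)  # Euclidean grad of -tr(X^T A X)
print("tr(X^T A X):", np.round(np.trace(X.T @ A @ X), 3))
print("sum of top-2 eigenvalues:", np.round(np.linalg.eigvalsh(A)[-2:].sum(), 3))
```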

Faster Adaptive Federated Learning

no code implementations • 2 Dec 2022 • Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang

Federated learning has attracted increasing attention with the emergence of distributed data.

Federated Learning, Image Classification, +1

Distributed Dynamic Safe Screening Algorithms for Sparse Regularization

no code implementations • 23 Apr 2022 • Runxue Bao, Xidong Wu, Wenhan Xian, Heng Huang

To the best of our knowledge, this is the first work on a distributed dynamic safe screening method.

Distributed Optimization
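
For orientation, the sketch below implements a basic static SAFE-style screening test for the Lasso, which provably discards inactive features before solving; the exact constant in the test follows the classical sphere rule and should be read as an assumption here. The paper's dynamic, distributed screening scheme is considerably more involved and is not reproduced; the data and regularization level are likewise illustrative.

```python
import numpy as np

def safe_screen_lasso(X, y, lam):
    """Static SAFE-style test for the Lasso 0.5*||y - X b||^2 + lam*||b||_1:
    feature j can be dropped if
    |x_j^T y| < lam - ||x_j|| * ||y|| * (lam_max - lam) / lam_max."""
    corr = np.abs(X.T @ y)
    lam_max = corr.max()                       # smallest lam giving the all-zero solution
    col_norms = np.linalg.norm(X, axis=0)
    bound = lam - col_norms * np.linalg.norm(y) * (lam_max - lam) / lam_max
    return corr >= bound                       # mask of features that survive

rng = np.random.default_rng(5)
X = rng.normal(size=(100, 500))
beta = np.zeros(500)
beta[:5] = 3.0                                 # sparse ground truth
y = X @ beta + 0.1 * rng.normal(size=100)
keep = safe_screen_lasso(X, y, lam=0.8 * np.abs(X.T @ y).max())
print(f"features kept after screening: {keep.sum()} / {X.shape[1]}")
```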

Efficient Mirror Descent Ascent Methods for Nonsmooth Minimax Problems

no code implementations • NeurIPS 2021 • Feihu Huang, Xidong Wu, Heng Huang

For our stochastic algorithms, we first prove that the mini-batch stochastic mirror descent ascent (SMDA) method obtains a sample complexity of $O(\kappa^3\epsilon^{-4})$ for finding an $\epsilon$-stationary point, where $\kappa$ denotes the condition number.
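
The sketch below shows the mirror descent ascent template with an entropic mirror map and noisy mini-batch gradients on a toy bilinear matrix game over probability simplices. The SMDA analysis above covers general nonsmooth settings; the game, step size, and iterate averaging here are illustrative assumptions.

```python
import numpy as np

def smda(A, lr=0.1, batch=8, iters=2000, seed=0):
    """Stochastic mirror descent ascent with the entropic mirror map on
    min_{x in simplex} max_{y in simplex} x^T A y, using noisy mini-batch
    gradients and returning the averaged iterates."""
    rng = np.random.default_rng(seed)
    n, m = A.shape
    x, y = np.ones(n) / n, np.ones(m) / m
    x_avg, y_avg = np.zeros(n), np.zeros(m)
    for _ in range(iters):
        noise = rng.normal(scale=0.5, size=(batch, n, m)).mean(axis=0)
        gx = (A + noise) @ y                   # stochastic gradient in x
        gy = (A + noise).T @ x                 # stochastic gradient in y
        x = x * np.exp(-lr * gx)
        x /= x.sum()                           # mirror descent (min player)
        y = y * np.exp(+lr * gy)
        y /= y.sum()                           # mirror ascent (max player)
        x_avg += x
        y_avg += y
    return x_avg / iters, y_avg / iters

A = np.array([[0.0, 1.0, -1.0], [-1.0, 0.0, 1.0], [1.0, -1.0, 0.0]])  # rock-paper-scissors
x, y = smda(A)
print("averaged iterates (roughly uniform):", np.round(x, 2), np.round(y, 2))
```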

AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax Optimization

no code implementations • 30 Jun 2021 • Feihu Huang, Xidong Wu, Zhengmian Hu

Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique. Without requiring large batches, it reaches a lower gradient complexity of $\tilde{O}(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point, improving on existing adaptive GDA methods by a factor of $O(\sqrt{\kappa})$.
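
In the same spirit, the sketch below combines momentum-averaged stochastic gradients with AdaGrad-style coordinate-wise step sizes on a toy saddle problem. It is not the paper's exact AdaGDA update; the problem, noise level, and the equal step sizes for the two players are assumptions for illustration.

```python
import numpy as np

# Toy saddle problem f(x, y) = 0.5*||x||^2 + x.y - 0.5*||y||^2, whose unique
# saddle point is (0, 0). Gradients are perturbed with noise to mimic the
# stochastic setting.
rng = np.random.default_rng(6)
d, lr, beta, eps = 3, 0.5, 0.9, 1e-8
x, y = np.ones(d), 0.5 * np.ones(d)
mx, my = np.zeros(d), np.zeros(d)              # momentum buffers
vx, vy = np.zeros(d), np.zeros(d)              # accumulated squared gradients

for _ in range(500):
    gx = x + y + 0.1 * rng.normal(size=d)      # noisy grad_x f
    gy = x - y + 0.1 * rng.normal(size=d)      # noisy grad_y f
    mx = beta * mx + (1 - beta) * gx           # momentum averaging
    my = beta * my + (1 - beta) * gy
    vx += gx ** 2                              # AdaGrad-style accumulators
    vy += gy ** 2
    x -= lr * mx / (np.sqrt(vx) + eps)         # adaptive descent step in x
    y += lr * my / (np.sqrt(vy) + eps)         # adaptive ascent step in y

print("estimate of the saddle point:", np.round(x, 3), np.round(y, 3))
```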
