Search Results for author: Zhaoxian Wu

Found 8 papers, 4 papers with code

Pipeline Gradient-based Model Training on Analog In-memory Accelerators

1 code implementation • 19 Oct 2024 • Zhaoxian Wu, Quan Xiao, Tayfun Gokmen, Hsinyu Tsai, Kaoutar El Maghraoui, Tianyi Chen

Aiming to accelerate the training of large deep neural network (DNN) models in an energy-efficient way, the analog in-memory computing (AIMC) accelerator emerges as a solution with immense potential.

Single-Timescale Multi-Sequence Stochastic Approximation Without Fixed Point Smoothness: Theories and Applications

no code implementations • 17 Oct 2024 • Yue Huang, Zhaoxian Wu, Shiqian Ma, Qing Ling

Stochastic approximation (SA) that involves multiple coupled sequences, known as multiple-sequence SA (MSSA), finds diverse applications in the fields of signal processing and machine learning.

Bilevel Optimization
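
To make the multi-sequence setting concrete, below is a minimal NumPy sketch of a generic single-timescale, two-sequence stochastic approximation (an illustration of the problem class, not the algorithm analyzed in the paper): a lower sequence tracks a noisy fixed point that depends on the upper iterate, while the upper sequence takes a stochastic gradient step using the current lower iterate. The quadratic objective, matrices, noise levels, and step sizes are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Illustrative coupled problem: the lower sequence y should track the fixed
# point y*(x) = A @ x, while the upper sequence x minimizes
# f(x) = 0.5 * ||y*(x) - b||^2 using the current (inexact) y in place of y*(x).
A = np.array([[1.0, 0.2],
              [0.0, 0.8]])
b = np.array([1.0, -1.0])

x = np.zeros(2)
y = np.zeros(2)
alpha, beta = 0.05, 0.1   # constant step sizes of the same order ("single timescale")

for _ in range(5000):
    # Noisy fixed-point (tracking) update for the lower sequence.
    y = y - beta * (y - A @ x + 0.01 * rng.standard_normal(2))
    # Noisy gradient step for the upper sequence, using y instead of y*(x).
    grad_x = A.T @ (y - b) + 0.01 * rng.standard_normal(2)
    x = x - alpha * grad_x

print("x:", x, "  y:", y, "  target y*(x):", A @ x)
```

Keeping both step sizes of the same order is what "single timescale" refers to here; classical two-timescale schemes would instead let the lower (tracking) sequence move on a faster timescale.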

On the Trade-off between Flatness and Optimization in Distributed Learning

no code implementations • 28 Jun 2024 • Ying Cao, Zhaoxian Wu, Kun Yuan, Ali H. Sayed

This paper proposes a theoretical framework to evaluate and compare the performance of gradient-descent algorithms for distributed learning in relation to their behavior around local minima in nonconvex environments.

Towards Exact Gradient-based Training on Analog In-memory Computing

no code implementations • 18 Jun 2024 • Zhaoxian Wu, Tayfun Gokmen, Malte J. Rasch, Tianyi Chen

Given the high economic and environmental costs of using large vision or language models, analog in-memory accelerators present a promising solution for energy-efficient AI.

Byzantine-Robust Distributed Online Learning: Taming Adversarial Participants in An Adversarial Environment

1 code implementation • 16 Jul 2023 • Xingrong Dong, Zhaoxian Wu, Qing Ling, Zhi Tian

We prove that, even with a class of state-of-the-art robust aggregation rules, distributed online gradient descent in an adversarial environment with Byzantine participants can only achieve a linear adversarial regret bound, which is tight.

Decision Making
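
As a concrete illustration of the setting above (not the paper's construction or its lower-bound instance), the following NumPy sketch runs distributed online gradient descent in which the server aggregates per-round worker gradients with a coordinate-wise median, one common robust aggregation rule, while a Byzantine minority submits arbitrary vectors. The worker counts, quadratic per-round losses, attack, and step size are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n_workers, n_byz, dim, rounds = 10, 3, 5, 200
eta = 0.1

w = np.zeros(dim)
cumulative_loss = 0.0
for t in range(rounds):
    # Round t reveals a simple quadratic loss f_t(w) = 0.5 * ||w - theta_t||^2.
    theta_t = rng.standard_normal(dim)
    cumulative_loss += 0.5 * np.sum((w - theta_t) ** 2)  # regret would subtract the comparator's loss

    grads = []
    for i in range(n_workers):
        if i < n_byz:
            grads.append(100.0 * rng.standard_normal(dim))               # Byzantine: arbitrary vector
        else:
            grads.append(w - theta_t + 0.1 * rng.standard_normal(dim))   # honest stochastic gradient
    # Robust aggregation: the coordinate-wise median bounds the influence of a
    # Byzantine minority, unlike a plain average.
    agg = np.median(np.stack(grads), axis=0)
    w = w - eta * agg

print("final iterate:", w, "  cumulative loss:", cumulative_loss)
```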

Byzantine-Robust Variance-Reduced Federated Learning over Distributed Non-i.i.d. Data

2 code implementations • 17 Sep 2020 • Jie Peng, Zhaoxian Wu, Qing Ling, Tianyi Chen

We prove that the proposed method reaches a neighborhood of the optimal solution at a linear convergence rate, and that the learning error is determined by the number of Byzantine workers.

Federated Learning

Federated Variance-Reduced Stochastic Gradient Descent with Robustness to Byzantine Attacks

no code implementations • 29 Dec 2019 • Zhaoxian Wu, Qing Ling, Tianyi Chen, Georgios B. Giannakis

This motivates us to reduce the variance of stochastic gradients as a means of robustifying SGD in the presence of Byzantine attacks.
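
A minimal sketch of that idea, under illustrative assumptions (a single honest worker, a toy least-squares problem, and a SAGA-style gradient table standing in for generic variance reduction; this is not necessarily the paper's exact method): variance-reduced stochastic gradients shrink toward the full gradient, and the resulting smaller spread among honest workers is what lets robust aggregation rules isolate Byzantine outliers.

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy local least-squares data held by one honest worker.
n, dim = 100, 3
X = rng.standard_normal((n, dim))
w_true = np.array([1.0, -2.0, 0.5])
y = X @ w_true + 0.1 * rng.standard_normal(n)

def sample_grad(w, i):
    """Stochastic gradient of 0.5 * (x_i @ w - y_i)^2 for a single sample."""
    return (X[i] @ w - y[i]) * X[i]

w = np.zeros(dim)
table = np.zeros((n, dim))        # SAGA-style memory: last gradient seen for each sample
table_mean = table.mean(axis=0)

plain_norms, vr_norms = [], []
for _ in range(3000):
    i = rng.integers(n)
    g = sample_grad(w, i)
    # Variance-reduced estimate: correct the fresh sample gradient by its stale
    # copy and the running average of stored gradients (unbiased for the full gradient).
    g_vr = g - table[i] + table_mean
    table_mean += (g - table[i]) / n
    table[i] = g
    plain_norms.append(np.linalg.norm(g))
    vr_norms.append(np.linalg.norm(g_vr))
    w -= 0.02 * g_vr

# Near the optimum, plain stochastic gradients keep a noise floor while the
# variance-reduced ones shrink -- the tighter cluster of honest messages is what
# makes robust aggregation against Byzantine outliers effective.
print("late-phase mean norm  plain: %.4f   variance-reduced: %.4f"
      % (np.mean(plain_norms[-300:]), np.mean(vr_norms[-300:])))
```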

Communication-Censored Distributed Stochastic Gradient Descent

1 code implementation • 9 Sep 2019 • Weiyu Li, Tianyi Chen, Liping Li, Zhaoxian Wu, Qing Ling

Specifically, in CSGD, the latest mini-batch stochastic gradient at a worker will be transmitted to the server if and only if it is sufficiently informative.

Quantization, Stochastic Optimization
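
The censoring rule described in the excerpt above can be sketched as follows, under illustrative assumptions (quadratic worker losses, a fixed threshold compared against the last transmitted gradient, and server reuse of stale gradients when a worker stays silent); the exact CSGD criterion, threshold schedule, and guarantees are in the paper.

```python
import numpy as np

rng = np.random.default_rng(3)
n_workers, dim, iters = 5, 4, 300
eta, tau = 0.1, 0.2               # tau: censoring threshold (illustrative)

# Worker m holds f_m(w) = 0.5 * ||w - targets[m]||^2, so the optimum is targets.mean(axis=0).
targets = rng.standard_normal((n_workers, dim))
w = np.zeros(dim)
last_sent = np.zeros((n_workers, dim))   # server's copy of each worker's last transmitted gradient
transmissions = 0

for t in range(iters):
    for m in range(n_workers):
        g = w - targets[m] + 0.01 * rng.standard_normal(dim)   # mini-batch stochastic gradient
        # Censoring: transmit only if the fresh gradient differs enough from the
        # previously transmitted one, i.e., if it is "sufficiently informative".
        if np.linalg.norm(g - last_sent[m]) >= tau:
            last_sent[m] = g
            transmissions += 1
        # Otherwise the server keeps reusing the stale gradient last_sent[m].
    w = w - eta * last_sent.mean(axis=0)

print("distance to optimum:", np.linalg.norm(w - targets.mean(axis=0)))
print("transmissions skipped: %.0f%%" % (100 * (1 - transmissions / (iters * n_workers))))
```

With a fixed threshold the iterates typically settle only within a threshold-dependent neighborhood of the optimum; a practical censoring scheme chooses or decays the threshold more carefully to trade communication savings against accuracy.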
