no code implementations • 8 Nov 2023 • Feihu Huang
In the paper, we propose a class of efficient adaptive bilevel methods based on mirror descent for nonconvex bilevel optimization, where the upper-level problem is nonconvex, possibly with nonsmooth regularization, and the lower-level problem is also nonconvex while satisfying the Polyak-{\L}ojasiewicz (PL) condition.
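As a concrete illustration of the mirror-descent primitive these methods build on, here is a minimal sketch of one mirror-descent step with the negative-entropy mirror map, which keeps iterates on the probability simplex. The bilevel structure, the hypergradient estimator, and the adaptive step sizes from the paper are omitted; `grad` is a stand-in oracle.

```python
import numpy as np

def mirror_descent_step(x, grad, lr=0.1):
    # entropic mirror step: argmin_u <grad, u> + (1/lr) * KL(u || x),
    # i.e. the exponentiated-gradient update
    logits = np.log(x) - lr * grad
    w = np.exp(logits - logits.max())   # subtract max for numerical stability
    return w / w.sum()

x = np.ones(4) / 4                      # start at the uniform distribution
g = np.array([1.0, -0.5, 0.2, 0.0])    # stand-in (hyper)gradient
print(mirror_descent_step(x, g))       # mass shifts toward low-gradient coords
```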
no code implementations • 21 Apr 2023 • Feihu Huang, Songcan Chen
Moreover, we provide a solid convergence analysis for our DM-GDA method, and prove that it obtains a near-optimal gradient complexity of $O(\epsilon^{-3})$ for finding an $\epsilon$-stationary solution of the nonconvex-PL stochastic minimax problems, which reaches the lower bound of nonconvex stochastic optimization.
no code implementations • 7 Mar 2023 • Feihu Huang
To fill this gap, in the paper, we study a class of nonconvex bilevel optimization problems, where both the upper-level and lower-level problems are nonconvex, and the lower-level problem satisfies the Polyak-{\L}ojasiewicz (PL) condition.
no code implementations • 7 Mar 2023 • Feihu Huang
In the paper, we study a class of nonconvex nonconcave minimax optimization problems (i.e., $\min_x\max_y f(x, y)$), where $f(x, y)$ is possibly nonconvex in $x$, and is nonconcave but satisfies the Polyak-{\L}ojasiewicz (PL) condition in $y$.
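A minimal sketch of two-timescale gradient descent ascent (GDA) on a toy objective of this form, $f(x, y) = \sin(x)\, y - y^2/2$, which is nonconvex in $x$ and strongly concave in $y$ (a special case of the PL condition); the paper's actual algorithms and step-size rules are not reproduced here.

```python
import numpy as np

def gda(x, y, steps=500, lr_x=0.01, lr_y=0.1):
    for _ in range(steps):
        gx = np.cos(x) * y      # df/dx
        gy = np.sin(x) - y      # df/dy
        x -= lr_x * gx          # descent on x (slow timescale)
        y += lr_y * gy          # ascent on y (fast timescale)
    return x, y

x, y = gda(x=1.0, y=0.0)
print(x, y, np.sin(x) - y)      # y tracks the inner maximizer y*(x) = sin(x)
```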
no code implementations • 13 Feb 2023 • Junyi Li, Feihu Huang, Heng Huang
This matches the best known rate for first-order FL algorithms and \textbf{FedDA-MVR} is the first adaptive FL algorithm that achieves this rate.
no code implementations • 13 Feb 2023 • Junyi Li, Feihu Huang, Heng Huang
In this work, we investigate Federated Bilevel Optimization problems and propose a communication-efficient algorithm, named FedBiOAcc.
no code implementations • ICCV 2023 • Shangqian Gao, Zeyu Zhang, yanfu Zhang, Feihu Huang, Heng Huang
To mitigate this gap, we first learn a target sub-network during the model training process, and then we use this sub-network to guide the learning of model weights through partial regularization.
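One plausible reading of "partial regularization" is a penalty that pushes only the weights outside the target sub-network toward zero while the task loss trains the full model; the sketch below (PyTorch assumed) illustrates that, with a fixed stand-in 0/1 mask since the paper's procedure for learning the mask is not shown.

```python
import torch
import torch.nn as nn

def partial_reg_loss(model, masks, task_loss, lam=1e-3):
    reg = task_loss.new_zeros(())
    for name, p in model.named_parameters():
        if name in masks:
            # penalize only the weights the target sub-network does NOT keep
            reg = reg + ((1.0 - masks[name]) * p).pow(2).sum()
    return task_loss + lam * reg

model = nn.Linear(8, 4)
masks = {"weight": (torch.rand(4, 8) > 0.5).float()}   # stand-in learned mask
x, y = torch.randn(2, 8), torch.randn(2, 4)
loss = partial_reg_loss(model, masks, nn.functional.mse_loss(model(x), y))
loss.backward()
```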
no code implementations • 2 Dec 2022 • Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang
Federated learning has attracted increasing attention with the emergence of distributed data.
no code implementations • 14 Nov 2022 • Feihu Huang, Xinrui Wang, Junyi Li, Songcan Chen
To fill this gap, in the paper, we study a class of nonconvex minimax optimization problems, and propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these distributed minimax problems.
no code implementations • 3 Nov 2022 • Feihu Huang
setting, and prove that our algorithms simultaneously achieve lower sample and communication complexities than the existing federated compositional algorithms.
no code implementations • 2 Nov 2022 • Feihu Huang
Thus, in the paper, we propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems, where the objective function of the Upper-Level (UL) problem is possibly nonconvex, and that of the Lower-Level (LL) problem is strongly convex.
no code implementations • 14 Oct 2022 • Wenhan Xian, Feihu Huang, Heng Huang
In our theoretical analysis, we prove that our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with the communication cost of $O(k \log(d))$ at each iteration.
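A communication cost of $O(k \log(d))$ per iteration is characteristic of top-$k$ style compressors ($k$ values plus $k$ indices); whether this matches the paper's exact scheme is an assumption, but the sketch below shows the standard operator.

```python
import numpy as np

def top_k_compress(g, k):
    idx = np.argpartition(np.abs(g), -k)[-k:]   # k largest-magnitude coords
    out = np.zeros_like(g)
    out[idx] = g[idx]                           # only (idx, g[idx]) is sent
    return out

g = np.array([0.1, -2.0, 0.05, 1.5, -0.3])
print(top_k_compress(g, k=2))                   # keeps -2.0 and 1.5
```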
1 code implementation • 12 Jul 2022 • Julong Young, Junhui Chen, Feihu Huang, Jian Peng
For fine-grained time series, this leads to a bottleneck in information input and prediction output, which is detrimental to long-term series forecasting.
no code implementations • 3 May 2022 • Junyi Li, Feihu Huang, Heng Huang
Specifically, we first propose FedBiO, a deterministic gradient-based algorithm, and show that it requires $O(\epsilon^{-2})$ iterations to reach an $\epsilon$-stationary point.
no code implementations • NeurIPS 2021 • Zhengmian Hu, Feihu Huang, Heng Huang
In the paper, we study the underdamped Langevin diffusion (ULD) with a strongly convex potential consisting of a finite sum of $N$ smooth components, and propose an efficient discretization method, which requires $O(N+d^\frac{1}{3}N^\frac{2}{3}/\varepsilon^\frac{2}{3})$ gradient evaluations to achieve $\varepsilon$-error (in $\sqrt{\mathbb{E}{\lVert{\cdot}\rVert_2^2}}$ distance) for approximating the $d$-dimensional ULD.
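For orientation, here is the plain Euler-Maruyama discretization of ULD on a standard Gaussian potential $U(x) = \lVert x \rVert^2/2$; the paper's discretization is more refined (it is what yields the improved gradient-evaluation bound), so this baseline scheme is for illustration only.

```python
import numpy as np

rng = np.random.default_rng(0)

def uld_step(x, v, grad_U, h=0.01, gamma=2.0):
    # dv = -(gamma*v + grad U(x)) dt + sqrt(2*gamma) dW;  dx = v dt
    noise = np.sqrt(2 * gamma * h) * rng.standard_normal(x.shape)
    v = v - h * (gamma * v + grad_U(x)) + noise
    x = x + h * v
    return x, v

x, v = np.ones(3), np.zeros(3)
for _ in range(1000):
    x, v = uld_step(x, v, grad_U=lambda z: z)   # U(x) = ||x||^2 / 2
print(x)                                        # roughly a draw from N(0, I)
```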
no code implementations • NeurIPS 2021 • Feihu Huang, Xidong Wu, Heng Huang
For our stochastic algorithms, we first prove that the mini-batch stochastic mirror descent ascent (SMDA) method obtains a sample complexity of $O(\kappa^3\epsilon^{-4})$ for finding an $\epsilon$-stationary point, where $\kappa$ denotes the condition number.
no code implementations • NeurIPS 2021 • Wenhan Xian, Feihu Huang, yanfu Zhang, Heng Huang
We prove that our DM-HSGD algorithm achieves a stochastic first-order oracle (SFO) complexity of $O(\kappa^3 \epsilon^{-3})$ for finding an $\epsilon$-stationary point of the decentralized stochastic nonconvex-strongly-concave problem, which improves the existing best theoretical results.
no code implementations • 26 Jul 2021 • Feihu Huang, Junyi Li, Shangqian Gao, Heng Huang
Specifically, we propose a bilevel optimization method based on the Bregman distance (BiO-BreD) to solve deterministic bilevel problems, which achieves lower computational complexity than the best-known results.
no code implementations • 30 Jun 2021 • Feihu Huang, Xidong Wu, Zhengmian Hu
Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique, which reaches a lower gradient complexity of $\tilde{O}(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point without large batches, improving the existing results of adaptive GDA methods by a factor of $O(\sqrt{\kappa})$.
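A minimal sketch of the update pattern the name suggests: momentum estimates of both gradients plus an AdaGrad-style coordinate-wise scaling on $x$. The constants and the paper's exact adaptive learning rates are placeholders.

```python
import numpy as np

def adagda_step(x, y, gx, gy, state, lr=0.1, beta=0.9, eps=1e-8):
    state["mx"] = beta * state["mx"] + (1 - beta) * gx   # momentum on x-grad
    state["my"] = beta * state["my"] + (1 - beta) * gy   # momentum on y-grad
    state["vx"] = state["vx"] + gx**2                    # AdaGrad accumulator
    x = x - lr * state["mx"] / (np.sqrt(state["vx"]) + eps)
    y = y + lr * state["my"]
    return x, y

# toy driver on f(x, y) = x*y - y^2/2 (df/dx = y, df/dy = x - y)
x, y, state = 1.0, 0.0, {"mx": 0.0, "my": 0.0, "vx": 0.0}
for _ in range(200):
    x, y = adagda_step(x, y, gx=y, gy=x - y, state=state)
```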
1 code implementation • ICLR 2022 • Feihu Huang, Shangqian Gao, Heng Huang
In the paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques.
no code implementations • 21 Jun 2021 • Feihu Huang, Junyi Li, Shangqian Gao
To fill this gap, in the paper, we propose a novel fast adaptive bilevel framework to solve stochastic bilevel optimization problems in which the outer problem is possibly nonconvex and the inner problem is strongly convex.
no code implementations • 21 Jun 2021 • Feihu Huang, Junyi Li
In the paper, we propose an effective and efficient Compositional Federated Learning (ComFedL) algorithm for a new compositional Federated Learning (FL) framework, which frequently appears in many data mining and machine learning problems with a hierarchical structure, such as distributionally robust FL and model-agnostic meta-learning (MAML).
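Setting federation aside, the compositional structure $\min_x f(g(x))$ is usually handled with a running estimate of the inner value $g(x)$, as in the stochastic compositional gradient sketch below; the ComFedL aggregation details are not reproduced and the instance is a toy.

```python
import numpy as np

def scgd_step(x, u, g_val, g_jac, f_grad, lr=0.05, beta=0.5):
    u = (1 - beta) * u + beta * g_val(x)     # running estimate of g(x)
    x = x - lr * g_jac(x).T @ f_grad(u)      # chain rule through the estimate
    return x, u

# toy instance: g(x) = A x, f(u) = 0.5 * ||u - b||^2
A = np.array([[1.0, 2.0], [0.0, 1.0]])
b = np.array([1.0, -1.0])
x, u = np.zeros(2), np.zeros(2)
for _ in range(500):
    x, u = scgd_step(x, u, g_val=lambda z: A @ z, g_jac=lambda z: A,
                     f_grad=lambda v: v - b)
print(A @ x - b)                             # approaches 0
```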
1 code implementation • CVPR 2021 • Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang
Specifically, we train a stand-alone neural network to predict sub-networks' performance and then maximize the output of the network as a proxy of accuracy to guide pruning.
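A minimal sketch (PyTorch assumed) of the idea: a small MLP stands in for the trained performance predictor, and we ascend its output with respect to a relaxed channel mask under a sparsity penalty. The predictor's training data and the paper's exact parameterization are omitted.

```python
import torch
import torch.nn as nn

predictor = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))
mask_logits = torch.zeros(64, requires_grad=True)      # one logit per channel
opt = torch.optim.Adam([mask_logits], lr=0.01)

for _ in range(100):
    mask = torch.sigmoid(mask_logits)                  # relaxed 0/1 mask
    # maximize predicted accuracy while encouraging sparsity
    loss = -predictor(mask).squeeze() + 2.0 * mask.mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```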
1 code implementation • NeurIPS 2021 • Feihu Huang, Junyi Li, Heng Huang
To fill this gap, we propose a faster and universal framework of adaptive gradients (i.e., SUPER-ADAM) by introducing a universal adaptive matrix that includes most existing adaptive gradient forms.
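The generic update behind such a framework is $x_{t+1} = x_t - \eta\, H_t^{-1} m_t$, where $m_t$ is a gradient estimator and the (here diagonal) adaptive matrix $H_t$ can be chosen to recover different methods; the Adam-style choice below is one instance, not the paper's full construction.

```python
import numpy as np

def super_adam_like_step(x, g, state, lr=0.01, b1=0.9, b2=0.99, eps=1e-8):
    state["m"] = b1 * state["m"] + (1 - b1) * g      # gradient estimator m_t
    state["v"] = b2 * state["v"] + (1 - b2) * g**2
    H = np.sqrt(state["v"]) + eps                    # diagonal adaptive matrix
    return x - lr * state["m"] / H

x, state = np.array([1.0, -2.0]), {"m": np.zeros(2), "v": np.zeros(2)}
for _ in range(300):
    x = super_adam_like_step(x, g=2 * x, state=state)   # toy f(x) = ||x||^2
```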
no code implementations • 9 Feb 2021 • Zhengmian Hu, Feihu Huang, Heng Huang
Moreover, our HMC methods with biased gradient estimators, such as SARAH and SARGE, require $\tilde{O}(N+\sqrt{N} \kappa^2 d^{\frac{1}{2}} \varepsilon^{-1})$ gradient complexity, which has the same dependency on the condition number $\kappa$ and dimension $d$ as the full-gradient method, but improves the dependency on the sample size $N$ by a factor of $N^{\frac{1}{2}}$.
no code implementations • 1 Jan 2021 • Shangqian Gao, Feihu Huang, Heng Huang
In this paper, we propose a novel channel pruning method to solve the problem of compression and acceleration of Convolutional Neural Networks (CNNs).
no code implementations • 13 Oct 2020 • Feihu Huang, Shangqian Gao
At the same time, we present an effective Riemannian stochastic gradient descent ascent (RSGDA) algorithm for the stochastic minimax optimization, which has a sample complexity of $O(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary solution.
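A minimal sketch of one RSGDA-like step with $x$ constrained to the unit sphere: project the gradient onto the tangent space, step, and retract by renormalizing, with a plain Euclidean ascent step on $y$; step sizes and the toy objective are placeholders.

```python
import numpy as np

def rsgda_step(x, y, gx, gy, lr_x=0.1, lr_y=0.1):
    rgrad = gx - (gx @ x) * x          # Riemannian gradient (tangent projection)
    x = x - lr_x * rgrad
    x = x / np.linalg.norm(x)          # retraction back onto the sphere
    y = y + lr_y * gy                  # Euclidean ascent on y
    return x, y

# toy f(x, y) = (a @ x) * y - y^2/2 with x on the unit sphere
a = np.array([1.0, 0.0, 0.0])
x, y = np.ones(3) / np.sqrt(3), 0.0
for _ in range(500):
    x, y = rsgda_step(x, y, gx=a * y, gy=x @ a - y)
print(x @ a)                           # approaches 0: x turns orthogonal to a
```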
no code implementations • 18 Aug 2020 • Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang
Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point.
no code implementations • 4 Aug 2020 • Feihu Huang, Songcan Chen, Heng Huang
Our theoretical analysis shows that the online SPIDER-ADMM has an IFO complexity of $\mathcal{O}(\epsilon^{-\frac{3}{2}})$, which improves the existing best results by a factor of $\mathcal{O}(\epsilon^{-\frac{1}{2}})$.
1 code implementation • ICML 2020 • Feihu Huang, Lue Tao, Songcan Chen
To relax the large batches required in Acc-SZOFW, we further propose a novel accelerated stochastic zeroth-order Frank-Wolfe method (Acc-SZOFW*) based on the new variance-reduction technique STORM, which still reaches a function query complexity of $O(d\epsilon^{-3})$ in the stochastic problem without relying on any large batches.
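A minimal sketch of the ingredients named here: a two-point zeroth-order gradient estimator, a Frank-Wolfe step over an $\ell_1$ ball, and a STORM-style recursive estimator tying them together. The smoothing radius, batch sizes, and step sizes are illustrative only.

```python
import numpy as np

def zo_grad(f, x, u, mu=1e-4):
    # two-point finite-difference estimator along direction u
    return (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u

def fw_step_l1(x, d, radius=1.0, gamma=0.1):
    s = np.zeros_like(x)
    i = np.argmax(np.abs(d))
    s[i] = -radius * np.sign(d[i])      # linear minimization oracle on L1 ball
    return x + gamma * (s - x)

rng = np.random.default_rng(0)
f = lambda z: np.sum(z**2)              # toy objective
x = np.full(5, 0.2)                     # feasible start: ||x||_1 = 1
u = rng.standard_normal(5)
d = zo_grad(f, x, u)
for _ in range(50):
    x_prev, x = x, fw_step_l1(x, d)
    u = rng.standard_normal(5)          # one direction shared by both queries
    # STORM-style recursion: d = g(x) + (1 - a) * (d - g(x_prev))
    d = zo_grad(f, x, u) + 0.9 * (d - zo_grad(f, x_prev, u))
print(f(x))                             # typically well below the start, 0.2
```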
1 code implementation • ICML 2020 • Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang
In particular, we present a non-adaptive version of the IS-MBPG method, i.e., IS-MBPG*, which also reaches the best-known sample complexity of $O(\epsilon^{-3})$ without any large batches.
no code implementations • 30 Jul 2019 • Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang
Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving complex machine learning problems, where gradients of the objective functions are unavailable or computationally prohibitive.
no code implementations • 29 May 2019 • Feihu Huang, Shangqian Gao, Songcan Chen, Heng Huang
In particular, our methods not only reach the best convergence rate of $O(1/T)$ for nonconvex optimization, but are also able to effectively solve many complex machine learning problems with multiple regularized penalties and constraints.
no code implementations • 16 Feb 2019 • Feihu Huang, Bin Gu, Zhouyuan Huo, Songcan Chen, Heng Huang
Proximal gradient method has been playing an important role to solve many machine learning tasks, especially for the nonsmooth problems.
no code implementations • 8 Feb 2018 • Feihu Huang, Songcan Chen
Moreover, we extend the mini-batch stochastic gradient method to both the nonconvex SVRG-ADMM and SAGA-ADMM proposed in our initial manuscript \cite{huang2016stochastic}, and prove that these mini-batch stochastic ADMMs also reach a convergence rate of $O(1/T)$ without any condition on the mini-batch size.
no code implementations • 26 Apr 2017 • Feihu Huang, Songcan Chen
To the best of our knowledge, this is the first proof that an accelerated SGD method converges linearly to a local minimum of a nonconvex optimization problem.
no code implementations • 10 Oct 2016 • Feihu Huang, Songcan Chen, Zhaosong Lu
Specifically, the first class, called nonconvex stochastic variance-reduced gradient ADMM (SVRG-ADMM), uses a multi-stage scheme to progressively reduce the variance of the stochastic gradients.
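As a reference point, here is the SVRG estimator and multi-stage loop that SVRG-ADMM builds on, applied to a toy least-squares problem; the ADMM splitting itself is omitted.

```python
import numpy as np

def svrg_estimator(grad_i, x, snapshot, full_grad, i):
    # unbiased estimate of the full gradient with reduced variance
    return grad_i(x, i) - grad_i(snapshot, i) + full_grad

rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 5)), rng.standard_normal(100)
grad_i = lambda x, i: (A[i] @ x - b[i]) * A[i]   # per-sample gradient

x = np.zeros(5)
for stage in range(20):                          # multi-stage scheme
    snapshot = x.copy()
    full_grad = A.T @ (A @ snapshot - b) / len(b)
    for _ in range(50):
        i = rng.integers(len(b))
        x -= 0.01 * svrg_estimator(grad_i, x, snapshot, full_grad, i)
print(np.linalg.norm(A.T @ (A @ x - b)) / len(b))   # small: near stationarity
```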