Search Results for author: Feihu Huang

Found 37 papers, 6 papers with code

Adaptive Mirror Descent Bilevel Optimization

no code implementations8 Nov 2023 Feihu Huang

In the paper, we propose a class of efficient adaptive bilevel methods based on mirror descent for nonconvex bilevel optimization, where the upper-level problem is nonconvex, possibly with nonsmooth regularization, and the lower-level problem is also nonconvex while satisfying the Polyak-Łojasiewicz (PL) condition.

Bilevel Optimization
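
For orientation, bilevel problems of the type studied in this entry are usually written as follows (a standard formulation, not quoted from the paper):

$$\min_{x}\; F(x) := f\big(x, y^*(x)\big) + h(x) \quad \text{s.t.}\quad y^*(x) \in \arg\min_{y}\; g(x, y),$$

where the upper-level objective $f$ may be nonconvex, $h$ is a possibly nonsmooth regularizer, and the lower-level objective $g(x, \cdot)$ is nonconvex but satisfies the PL condition.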

Near-Optimal Decentralized Momentum Method for Nonconvex-PL Minimax Problems

no code implementations21 Apr 2023 Feihu Huang, Songcan Chen

Moreover, we provide a solid convergence analysis for our DM-GDA method, and prove that it obtains a near-optimal gradient complexity of $O(\epsilon^{-3})$ for finding an $\epsilon$-stationary solution of the nonconvex-PL stochastic minimax problems, which reaches the lower bound of nonconvex stochastic optimization.

Stochastic Optimization

On Momentum-Based Gradient Methods for Bilevel Optimization with Nonconvex Lower-Level

no code implementations7 Mar 2023 Feihu Huang

To fill this gap, in the paper, we study a class of nonconvex bilevel optimization problems, where both the upper-level and lower-level problems are nonconvex, and the lower-level problem satisfies the Polyak-Łojasiewicz (PL) condition.

Bilevel Optimization Continual Learning +2

Enhanced Adaptive Gradient Algorithms for Nonconvex-PL Minimax Optimization

no code implementations7 Mar 2023 Feihu Huang

In the paper, we study a class of nonconvex nonconcave minimax optimization problems (i.e., $\min_x\max_y f(x, y)$), where $f(x, y)$ is possibly nonconvex in $x$, and it is nonconcave but satisfies the Polyak-Łojasiewicz (PL) condition in $y$.
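
For reference, the nonconvex-PL setting means the inner maximization, while nonconcave, satisfies a Polyak-Łojasiewicz inequality in $y$ (standard definition, not quoted from the paper):

$$\min_x \max_y f(x, y), \qquad \tfrac{1}{2}\,\|\nabla_y f(x, y)\|^2 \;\ge\; \mu\,\big(\max_{y'} f(x, y') - f(x, y)\big) \quad \text{for some } \mu > 0.$$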

FedDA: Faster Framework of Local Adaptive Gradient Methods via Restarted Dual Averaging

no code implementations13 Feb 2023 Junyi Li, Feihu Huang, Heng Huang

This matches the best known rate for first-order FL algorithms, and FedDA-MVR is the first adaptive FL algorithm that achieves this rate.

Federated Learning

Communication-Efficient Federated Bilevel Optimization with Local and Global Lower Level Problems

no code implementations13 Feb 2023 Junyi Li, Feihu Huang, Heng Huang

In this work, we investigate Federated Bilevel Optimization problems and propose a communication-efficient algorithm, named FedBiOAcc.

Bilevel Optimization Federated Learning +1

Structural Alignment for Network Pruning through Partial Regularization

no code implementations ICCV 2023 Shangqian Gao, Zeyu Zhang, Yanfu Zhang, Feihu Huang, Heng Huang

To mitigate this gap, we first learn a target sub-network during the model training process, and then we use this sub-network to guide the learning of model weights through partial regularization.

Network Pruning

Faster Adaptive Federated Learning

no code implementations2 Dec 2022 Xidong Wu, Feihu Huang, Zhengmian Hu, Heng Huang

Federated learning has attracted increasing attention with the emergence of distributed data.

Federated Learning Image Classification +1

Adaptive Federated Minimax Optimization with Lower Complexities

no code implementations14 Nov 2022 Feihu Huang, Xinrui Wang, Junyi Li, Songcan Chen

To fill this gap, in the paper, we study a class of nonconvex minimax optimization problems, and propose an efficient adaptive federated minimax optimization algorithm (i.e., AdaFGDA) to solve these distributed minimax problems.

Federated Learning Privacy Preserving

Faster Adaptive Momentum-Based Federated Methods for Distributed Composition Optimization

no code implementations3 Nov 2022 Feihu Huang

We prove that our algorithms simultaneously obtain lower sample and communication complexities than the existing federated compositional algorithms.

Federated Learning Meta-Learning

Fast Adaptive Federated Bilevel Optimization

no code implementations2 Nov 2022 Feihu Huang

Thus, in the paper, we propose a novel adaptive federated bilevel optimization algorithm (i.e., AdaFBiO) to solve distributed bilevel optimization problems, where the objective function of the Upper-Level (UL) problem is possibly nonconvex, and that of the Lower-Level (LL) problem is strongly convex.

Bilevel Optimization Distributed Optimization +2

Communication-Efficient Adam-Type Algorithms for Distributed Data Mining

no code implementations14 Oct 2022 Wenhan Xian, Feihu Huang, Heng Huang

In our theoretical analysis, we prove that our new algorithm achieves a fast convergence rate of $O(\frac{1}{\sqrt{nT}} + \frac{1}{(k/d)^2 T})$ with the communication cost of $O(k \log(d))$ at each iteration.

Vocal Bursts Type Prediction
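
The stated $O(k \log(d))$ communication cost per iteration is consistent with sending only $k$ of the $d$ gradient coordinates plus their indices. A minimal top-$k$ compressor sketch is given below; the function names and interface are illustrative assumptions, not the paper's API.

```python
import numpy as np

def topk_compress(grad: np.ndarray, k: int):
    """Keep only the k largest-magnitude coordinates of a gradient vector.
    Transmitting (indices, values) costs roughly O(k log d) bits, since each
    index needs about log2(d) bits. Illustrative sketch only."""
    idx = np.argpartition(np.abs(grad), -k)[-k:]  # indices of the top-k entries
    return idx, grad[idx]

def topk_decompress(idx: np.ndarray, vals: np.ndarray, d: int) -> np.ndarray:
    """Rebuild a sparse d-dimensional vector from (indices, values)."""
    out = np.zeros(d)
    out[idx] = vals
    return out
```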

Dateformer: Time-modeling Transformer for Longer-term Series Forecasting

1 code implementation12 Jul 2022 Julong Young, Junhui Chen, Feihu Huang, Jian Peng

For fine-grained time series, this leads to a bottleneck in information input and prediction output, which is fatal to long-term series forecasting.

Time Series Time Series Forecasting

Local Stochastic Bilevel Optimization with Momentum-Based Variance Reduction

no code implementations3 May 2022 Junyi Li, Feihu Huang, Heng Huang

Specifically, we first propose FedBiO, a deterministic gradient-based algorithm, and we show it requires $O(\epsilon^{-2})$ iterations to reach an $\epsilon$-stationary point.

BIG-bench Machine Learning Bilevel Optimization +3

Optimal Underdamped Langevin MCMC Method

no code implementations NeurIPS 2021 Zhengmian Hu, Feihu Huang, Heng Huang

In the paper, we study the underdamped Langevin diffusion (ULD) with strongly-convex potential consisting of finite summation of $N$ smooth components, and propose an efficient discretization method, which requires $O(N+d^\frac{1}{3}N^\frac{2}{3}/\varepsilon^\frac{2}{3})$ gradient evaluations to achieve $\varepsilon$-error (in $\sqrt{\mathbb{E}{\lVert{\cdot}\rVert_2^2}}$ distance) for approximating $d$-dimensional ULD.
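
For context, the underdamped Langevin diffusion being discretized is the coupled SDE (standard parameterization, not specific to this paper):

$$dv_t = -\gamma\, v_t\, dt - u\, \nabla f(x_t)\, dt + \sqrt{2\gamma u}\; dB_t, \qquad dx_t = v_t\, dt,$$

where $f = \sum_{i=1}^{N} f_i$ is the strongly-convex potential, $\gamma$ is the friction coefficient, $u$ is the inverse mass, and $B_t$ is a standard Brownian motion.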

Efficient Mirror Descent Ascent Methods for Nonsmooth Minimax Problems

no code implementations NeurIPS 2021 Feihu Huang, Xidong Wu, Heng Huang

For our stochastic algorithms, we first prove that the mini-batch stochastic mirror descent ascent (SMDA) method obtains a sample complexity of $O(\kappa^3\epsilon^{-4})$ for finding an $\epsilon$-stationary point, where $\kappa$ denotes the condition number.
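
A typical mirror descent ascent iteration of this kind replaces Euclidean projections with Bregman proximal steps (a generic template, not the paper's exact update):

$$x_{t+1} = \arg\min_{x}\Big\{\langle g_x^t, x\rangle + h(x) + \tfrac{1}{\gamma} D_{\psi}(x, x_t)\Big\}, \qquad y_{t+1} = \arg\max_{y}\Big\{\langle g_y^t, y\rangle - \tfrac{1}{\lambda} D_{\phi}(y, y_t)\Big\},$$

where $g_x^t, g_y^t$ are mini-batch stochastic gradients, $h$ is the nonsmooth term, and $D_{\psi}, D_{\phi}$ are Bregman distances generated by mirror maps $\psi$ and $\phi$.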

A Faster Decentralized Algorithm for Nonconvex Minimax Problems

no code implementations NeurIPS 2021 Wenhan Xian, Feihu Huang, Yanfu Zhang, Heng Huang

We prove that our DM-HSGD algorithm achieves a stochastic first-order oracle (SFO) complexity of $O(\kappa^3 \epsilon^{-3})$ for the decentralized stochastic nonconvex-strongly-concave problem to search an $\epsilon$-stationary point, which improves the existing best theoretical results.

Enhanced Bilevel Optimization via Bregman Distance

no code implementations26 Jul 2021 Feihu Huang, Junyi Li, Shangqian Gao, Heng Huang

Specifically, we propose a bilevel optimization method based on Bregman distance (BiO-BreD) to solve deterministic bilevel problems, which achieves a lower computational complexity than the best known results.

Bilevel Optimization Hyperparameter Optimization +2

AdaGDA: Faster Adaptive Gradient Descent Ascent Methods for Minimax Optimization

no code implementations30 Jun 2021 Feihu Huang, Xidong Wu, Zhengmian Hu

Specifically, we propose a fast Adaptive Gradient Descent Ascent (AdaGDA) method based on the basic momentum technique, which reaches a lower gradient complexity of $\tilde{O}(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary point without large batches, improving the existing results of adaptive GDA methods by a factor of $O(\sqrt{\kappa})$.
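
Below is a minimal, generic sketch of a momentum-based gradient descent ascent loop with an Adam-style adaptive step on $x$, meant only to illustrate the family of updates AdaGDA belongs to; it is not the paper's exact algorithm, and `grad_x`/`grad_y` are assumed stochastic gradient oracles.

```python
import numpy as np

def momentum_gda(grad_x, grad_y, x, y, T=1000, eta=0.01, lam=0.1, beta=0.9, eps=1e-8):
    """Generic momentum GDA with a coordinate-wise adaptive step size (illustrative only)."""
    mx, my = np.zeros_like(x), np.zeros_like(y)  # momentum buffers
    vx = np.zeros_like(x)                        # second-moment estimate for x
    for _ in range(T):
        gx, gy = grad_x(x, y), grad_y(x, y)      # stochastic gradient estimates
        mx = beta * mx + (1 - beta) * gx         # momentum on the descent variable
        my = beta * my + (1 - beta) * gy         # momentum on the ascent variable
        vx = beta * vx + (1 - beta) * gx**2      # Adam-style second-moment tracking
        x = x - eta * mx / (np.sqrt(vx) + eps)   # adaptive descent step on x
        y = y + lam * my                         # ascent step on y
    return x, y
```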

Bregman Gradient Policy Optimization

1 code implementation ICLR 2022 Feihu Huang, Shangqian Gao, Heng Huang

In the paper, we design a novel Bregman gradient policy optimization framework for reinforcement learning based on Bregman divergences and momentum techniques.

Reinforcement Learning (RL)
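
For reference, the Bregman divergence generated by a strongly convex mirror function $\psi$, the building block of such updates, is defined as (standard definition, not quoted from the paper):

$$D_{\psi}(x, y) = \psi(x) - \psi(y) - \langle \nabla \psi(y),\, x - y \rangle,$$

which recovers the squared Euclidean distance for $\psi(x) = \tfrac{1}{2}\|x\|^2$ and the KL divergence for the negative entropy on the probability simplex.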

BiAdam: Fast Adaptive Bilevel Optimization Methods

no code implementations21 Jun 2021 Feihu Huang, Junyi Li, Shangqian Gao

To fill this gap, in the paper, we propose a novel fast adaptive bilevel framework to solve stochastic bilevel optimization problems where the outer problem is possibly nonconvex and the inner problem is strongly convex.

Bilevel Optimization Meta-Learning +1

Compositional federated learning: Applications in distributionally robust averaging and meta learning

no code implementations21 Jun 2021 Feihu Huang, Junyi Li

In the paper, we propose an effective and efficient Compositional Federated Learning (ComFedL) algorithm for solving a new compositional Federated Learning (FL) framework, which frequently appears in many data mining and machine learning problems with a hierarchical structure such as distributionally robust FL and model-agnostic meta learning (MAML).

BIG-bench Machine Learning Federated Learning +2

Network Pruning via Performance Maximization

1 code implementation CVPR 2021 Shangqian Gao, Feihu Huang, Weidong Cai, Heng Huang

Specifically, we train a stand-alone neural network to predict sub-networks' performance and then maximize the output of the network as a proxy of accuracy to guide pruning.

Model Compression Network Pruning

SUPER-ADAM: Faster and Universal Framework of Adaptive Gradients

1 code implementation NeurIPS 2021 Feihu Huang, Junyi Li, Heng Huang

To fill this gap, we propose a faster and universal framework of adaptive gradients (i.e., SUPER-ADAM) by introducing a universal adaptive matrix that includes most existing adaptive gradient forms.
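
Schematically, writing the adaptive information as a positive-definite matrix $H_t$ covers many such methods in one template (a generic sketch, not quoted from the paper):

$$m_t = \beta_t\, m_{t-1} + (1-\beta_t)\, \nabla f(x_t; \xi_t), \qquad x_{t+1} = x_t - \gamma\, H_t^{-1} m_t,$$

where choosing $H_t = \mathrm{diag}(\sqrt{v_t} + \epsilon)$ yields Adam-style updates and $H_t = I$ yields SGD with momentum.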

A New Framework for Variance-Reduced Hamiltonian Monte Carlo

no code implementations9 Feb 2021 Zhengmian Hu, Feihu Huang, Heng Huang

Moreover, our HMC methods with biased gradient estimators, such as SARAH and SARGE, require $\tilde{O}(N+\sqrt{N} \kappa^2 d^{\frac{1}{2}} \varepsilon^{-1})$ gradient complexity, which has the same dependency on the condition number $\kappa$ and dimension $d$ as the full gradient method, but improves the dependency on the sample size $N$ by a factor of $N^\frac{1}{2}$.

Model Compression via Hyper-Structure Network

no code implementations1 Jan 2021 Shangqian Gao, Feihu Huang, Heng Huang

In this paper, we propose a novel channel pruning method to solve the problem of compression and acceleration of Convolutional Neural Networks (CNNs).

Model Compression

Gradient Descent Ascent for Minimax Problems on Riemannian Manifolds

no code implementations13 Oct 2020 Feihu Huang, Shangqian Gao

At the same time, we present an effective Riemannian stochastic gradient descent ascent (RSGDA) algorithm for the stochastic minimax optimization, which has a sample complexity of $O(\kappa^4\epsilon^{-4})$ for finding an $\epsilon$-stationary solution.

Accelerated Zeroth-Order and First-Order Momentum Methods from Mini to Minimax Optimization

no code implementations18 Aug 2020 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Our Acc-MDA achieves a low gradient complexity of $\tilde{O}(\kappa_y^{4.5}\epsilon^{-3})$ without requiring large batches for finding an $\epsilon$-stationary point.

Adversarial Attack

Faster Stochastic Alternating Direction Method of Multipliers for Nonconvex Optimization

no code implementations4 Aug 2020 Feihu Huang, Songcan Chen, Heng Huang

Our theoretical analysis shows that the online SPIDER-ADMM has the IFO complexity of $\mathcal{O}(\epsilon^{-\frac{3}{2}})$, which improves the existing best results by a factor of $\mathcal{O}(\epsilon^{-\frac{1}{2}})$.

Accelerated Stochastic Gradient-free and Projection-free Methods

1 code implementation ICML 2020 Feihu Huang, Lue Tao, Songcan Chen

To relax the large batches required in the Acc-SZOFW, we further propose a novel accelerated stochastic zeroth-order Frank-Wolfe (Acc-SZOFW*) based on a new variance reduced technique of STORM, which still reaches the function query complexity of $O(d\epsilon^{-3})$ in the stochastic problem without relying on any large batches.

Adversarial Attack
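
Below is a minimal sketch of the two ingredients named in this entry: a two-point zeroth-order gradient estimator and a STORM-style variance-reduced momentum correction. It is an illustration under common conventions, not the paper's exact Acc-SZOFW* procedure; `f` is assumed to be a (possibly noisy) function-value oracle.

```python
import numpy as np

def zo_gradient(f, x, mu=1e-3, num_dirs=1):
    """Two-point zeroth-order gradient estimate using random Gaussian directions."""
    g = np.zeros_like(x)
    for _ in range(num_dirs):
        u = np.random.randn(*x.shape)
        g += (f(x + mu * u) - f(x - mu * u)) / (2 * mu) * u
    return g / num_dirs

def storm_estimator(g_curr, g_prev, v_prev, a=0.1):
    """STORM-style recursive momentum: v_t = g(x_t) + (1 - a) * (v_{t-1} - g(x_{t-1})),
    where both gradient estimates are computed from the same fresh randomness."""
    return g_curr + (1 - a) * (v_prev - g_prev)
```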

Momentum-Based Policy Gradient Methods

1 code implementation ICML 2020 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

In particular, we present a non-adaptive version of the IS-MBPG method, i.e., IS-MBPG*, which also reaches the best known sample complexity of $O(\epsilon^{-3})$ without any large batches.

Policy Gradient Methods

Nonconvex Zeroth-Order Stochastic ADMM Methods with Lower Function Query Complexity

no code implementations30 Jul 2019 Feihu Huang, Shangqian Gao, Jian Pei, Heng Huang

Zeroth-order (a.k.a. derivative-free) methods are a class of effective optimization methods for solving complex machine learning problems, where gradients of the objective functions are not available or computationally prohibitive.

Adversarial Attack

Zeroth-Order Stochastic Alternating Direction Method of Multipliers for Nonconvex Nonsmooth Optimization

no code implementations29 May 2019 Feihu Huang, Shangqian Gao, Songcan Chen, Heng Huang

In particular, our methods not only reach the best convergence rate $O(1/T)$ for the nonconvex optimization, but also are able to effectively solve many complex machine learning problems with multiple regularized penalties and constraints.

Adversarial Attack BIG-bench Machine Learning +1

Faster Gradient-Free Proximal Stochastic Methods for Nonconvex Nonsmooth Optimization

no code implementations16 Feb 2019 Feihu Huang, Bin Gu, Zhouyuan Huo, Songcan Chen, Heng Huang

The proximal gradient method has played an important role in solving many machine learning tasks, especially nonsmooth problems.

BIG-bench Machine Learning

Mini-Batch Stochastic ADMMs for Nonconvex Nonsmooth Optimization

no code implementations8 Feb 2018 Feihu Huang, Songcan Chen

Moreover, we extend the mini-batch stochastic gradient method to both the nonconvex SVRG-ADMM and SAGA-ADMM proposed in our initial manuscript \cite{huang2016stochastic}, and prove that these mini-batch stochastic ADMMs also reach the convergence rate of $O(1/T)$ without any condition on the mini-batch size.

Linear Convergence of Accelerated Stochastic Gradient Descent for Nonconvex Nonsmooth Optimization

no code implementations26 Apr 2017 Feihu Huang, Songcan Chen

To the best of our knowledge, this is the first proof that the accelerated SGD method converges linearly to a local minimum of a nonconvex optimization problem.

Stochastic Alternating Direction Method of Multipliers with Variance Reduction for Nonconvex Optimization

no code implementations10 Oct 2016 Feihu Huang, Songcan Chen, Zhaosong Lu

Specifically, the first class called the nonconvex stochastic variance reduced gradient ADMM (SVRG-ADMM), uses a multi-stage scheme to progressively reduce the variance of stochastic gradients.
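
A compact sketch of the variance-reduced gradient estimator at the heart of SVRG-type methods is shown below (the standard construction, not the paper's full ADMM procedure); `grads` is assumed to be a list of per-sample gradient functions.

```python
import numpy as np

def svrg_gradient(grads, x, x_snapshot, full_grad_snapshot, batch_idx):
    """SVRG estimator: a mini-batch gradient at x corrected by the same mini-batch
    gradient at a snapshot point, so the variance shrinks as x nears the snapshot."""
    g = np.zeros_like(x)
    for i in batch_idx:
        g += grads[i](x) - grads[i](x_snapshot)
    g /= len(batch_idx)
    return g + full_grad_snapshot  # full gradient is recomputed once per stage/epoch
```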
