Search Results for author: Haishan Ye

Found 22 papers, 2 papers with code

Second-Order Fine-Tuning without Pain for LLMs: A Hessian Informed Zeroth-Order Optimizer

no code implementations 23 Feb 2024 Yanjun Zhao, Sizhe Dang, Haishan Ye, Guang Dai, Yi Qian, Ivor W. Tsang

Fine-tuning large language models (LLMs) with classic first-order optimizers incurs prohibitive GPU memory costs because of the backpropagation process.
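
The zeroth-order idea behind this line of work replaces backpropagated gradients with estimates built from forward evaluations alone, which is what removes the activation-memory cost. The snippet below is a minimal two-point (SPSA-style) sketch on a toy quadratic; the objective and step sizes are illustrative, and this is not the paper's Hessian-informed optimizer.

```python
import numpy as np

rng = np.random.default_rng(0)

def two_point_grad(loss, x, mu=1e-3):
    """Estimate the gradient of `loss` at x from two forward evaluations only."""
    u = rng.standard_normal(x.shape)            # random perturbation direction
    g = (loss(x + mu * u) - loss(x - mu * u)) / (2 * mu)
    return g * u                                # directional derivative pushed back along u

# Illustrative objective: a small quadratic standing in for an LLM loss.
A = np.diag([10.0, 1.0, 0.1])
loss = lambda x: 0.5 * x @ A @ x

x = np.ones(3)
for _ in range(2000):
    x -= 1e-2 * two_point_grad(loss, x)         # SGD-style update with zeroth-order gradients
print(loss(x))                                  # far below the initial value of 5.55
```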

PPFL: A Personalized Federated Learning Framework for Heterogeneous Population

no code implementations 22 Oct 2023 Hao Di, Yi Yang, Haishan Ye, Xiangyu Chang

Personalization aims to characterize individual preferences and is widely applied across many fields.

Personalized Federated Learning

Decentralized Riemannian Conjugate Gradient Method on the Stiefel Manifold

no code implementations 21 Aug 2023 Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu

This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method that aims at minimizing a global function over the Stiefel manifold.

Second-order methods
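
For readers unfamiliar with optimization on the Stiefel manifold, the sketch below shows a single Riemannian gradient step: project the Euclidean gradient onto the tangent space at X, then retract back to the manifold with a QR factorization. It is a generic illustration of the geometry, not the proposed DRCGD method (which adds conjugate directions and decentralized communication); the objective and step size are arbitrary choices.

```python
import numpy as np

def tangent_projection(X, G):
    """Project the Euclidean gradient G onto the tangent space of the Stiefel manifold at X."""
    sym = 0.5 * (X.T @ G + G.T @ X)
    return G - X @ sym

def qr_retraction(X, V):
    """Map a tangent vector V at X back onto the manifold via a QR factorization."""
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))              # fix column signs for uniqueness

# Example: leading eigenspace of a symmetric matrix, f(X) = -trace(X^T A X).
n, p = 20, 3
A = np.random.randn(n, n); A = A + A.T
X, _ = np.linalg.qr(np.random.randn(n, p))      # random feasible point, X^T X = I

for _ in range(200):
    G = -2 * A @ X                              # Euclidean gradient of f
    xi = tangent_projection(X, G)
    X = qr_retraction(X, -0.01 * xi)            # step along the negative Riemannian gradient
print(np.linalg.norm(X.T @ X - np.eye(p)))      # iterate stays (numerically) on the manifold
```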

Mirror Natural Evolution Strategies

no code implementations 1 Aug 2023 Haishan Ye

In this paper, we focus on the theory of zeroth-order optimization, which utilizes both first-order and second-order information approximated by zeroth-order queries.
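
The second-order information mentioned above can likewise be probed with function values only: a central finite difference along a direction u approximates the curvature u^T ∇²f(x) u. A minimal sketch on a quadratic where the true answer is known (the smoothing parameter mu is an arbitrary choice):

```python
import numpy as np

def zo_curvature(f, x, u, mu=1e-3):
    """Zeroth-order estimate of the curvature u^T H u at x, using three function queries."""
    return (f(x + mu * u) - 2.0 * f(x) + f(x - mu * u)) / mu**2

H = np.diag([4.0, 1.0])
f = lambda x: 0.5 * x @ H @ x
x = np.array([1.0, -2.0])
u = np.array([1.0, 0.0])
print(zo_curvature(f, x, u))                    # ~4.0, matching the true u^T H u
```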

An Efficient Stochastic Algorithm for Decentralized Nonconvex-Strongly-Concave Minimax Optimization

no code implementations 5 Dec 2022 Lesi Chen, Haishan Ye, Luo Luo

This paper studies the stochastic optimization for decentralized nonconvex-strongly-concave (NC-SC) minimax problems over a multi-agent network.

Stochastic Optimization

An Optimal Stochastic Algorithm for Decentralized Nonconvex Finite-sum Optimization

no code implementations 25 Oct 2022 Luo Luo, Haishan Ye

This paper studies the decentralized nonconvex optimization problem $\min_{x\in{\mathbb R}^d} f(x)\triangleq \frac{1}{m}\sum_{i=1}^m f_i(x)$, where $f_i(x)\triangleq \frac{1}{n}\sum_{j=1}^n f_{i, j}(x)$ is the local function on the $i$-th agent of the network.
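
For context on this problem class, the sketch below runs the simplest decentralized baseline on that finite-sum structure: each agent takes a stochastic gradient step on its local f_i and then averages with its neighbours through a doubly stochastic mixing matrix W. The quadratic local losses and the ring topology are illustrative, and this is the naive scheme such papers improve on, not the optimal variance-reduced algorithm.

```python
import numpy as np

m, n, d = 5, 20, 3                              # agents, samples per agent, dimension
rng = np.random.default_rng(0)
data = rng.standard_normal((m, n, d))           # f_{i,j}(x) = 0.5 * ||x - data[i, j]||^2

# Ring-topology doubly stochastic mixing matrix W.
W = np.zeros((m, m))
for i in range(m):
    W[i, i], W[i, (i - 1) % m], W[i, (i + 1) % m] = 0.5, 0.25, 0.25

X = rng.standard_normal((m, d))                 # one local iterate per agent
for _ in range(200):
    j = rng.integers(n)                         # sampled index (shared across agents for simplicity)
    grads = X - data[:, j, :]                   # stochastic gradient of each local f_i
    X = W @ (X - 0.1 * grads)                   # local step followed by gossip averaging
print(np.abs(X - data.mean(axis=(0, 1))).max()) # every agent ends up near the global minimizer
```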

Decentralized Stochastic Variance Reduced Extragradient Method

no code implementations 1 Feb 2022 Luo Luo, Haishan Ye

This paper studies decentralized convex-concave minimax optimization problems of the form $\min_x\max_y f(x, y) \triangleq\frac{1}{m}\sum_{i=1}^m f_i(x, y)$, where $m$ is the number of agents and each local function can be written as $f_i(x, y)=\frac{1}{n}\sum_{j=1}^n f_{i, j}(x, y)$.
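
For readers unfamiliar with the extragradient template referenced in the title, a single-agent deterministic version looks like the sketch below: take a trial gradient step, then perform the actual update using gradients evaluated at the trial point. The bilinear toy problem and step size are illustrative; the paper's method adds decentralization and variance reduction on top of this basic step.

```python
import numpy as np

def extragradient(grad_x, grad_y, x, y, eta=0.5, iters=200):
    """Basic extragradient iteration for min_x max_y f(x, y)."""
    for _ in range(iters):
        # Trial (leader) step.
        x_half = x - eta * grad_x(x, y)
        y_half = y + eta * grad_y(x, y)
        # Actual step uses gradients evaluated at the trial point.
        x = x - eta * grad_x(x_half, y_half)
        y = y + eta * grad_y(x_half, y_half)
    return x, y

# Illustrative bilinear saddle problem f(x, y) = x * y, saddle point at (0, 0).
gx = lambda x, y: y                             # df/dx
gy = lambda x, y: x                             # df/dy
print(extragradient(gx, gy, 1.0, 1.0))          # converges toward the saddle point (0, 0)
```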

Eigencurve: Optimal Learning Rate Schedule for SGD on Quadratic Objectives with Skewed Hessian Spectrums

1 code implementation ICLR 2022 Rui Pan, Haishan Ye, Tong Zhang

In this paper, we propose Eigencurve, the first family of learning rate schedules that can achieve minimax optimal convergence rates (up to a constant) for SGD on quadratic objectives when the eigenvalue distribution of the underlying Hessian matrix is skewed.

Image Classification
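
Schedules of this kind are precomputed from (estimates of) the Hessian spectrum and then fed to SGD as a per-step learning-rate multiplier. The sketch below only shows that plumbing with PyTorch's LambdaLR; the polynomial-decay multiplier is a placeholder, not the actual Eigencurve schedule.

```python
import torch

model = torch.nn.Linear(10, 1)
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# Placeholder multiplier: any precomputed, spectrum-aware schedule could go here.
def lr_multiplier(step, total_steps=1000):
    return (1.0 - step / total_steps) ** 2      # illustrative polynomial decay

scheduler = torch.optim.lr_scheduler.LambdaLR(optimizer, lr_lambda=lr_multiplier)

for step in range(1000):
    x, y = torch.randn(32, 10), torch.randn(32, 1)
    loss = torch.nn.functional.mse_loss(model(x), y)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    scheduler.step()                            # applies the per-step multiplier to the base lr
```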

DeEPCA: Decentralized Exact PCA with Linear Convergence Rate

no code implementations 8 Feb 2021 Haishan Ye, Tong Zhang

This leads to a decentralized PCA algorithm called DeEPCA, which has a convergence rate similar to that of the centralized PCA, while achieving the best communication complexity among existing decentralized PCA algorithms.
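
The sketch below shows the underlying idea in its most naive form: each agent multiplies the iterate by its local covariance matrix, the results are averaged over the network through a mixing matrix, and the iterate is re-orthonormalized. DeEPCA itself replaces the plain averaging with a gradient-tracking step to obtain exact linear convergence; the data, topology, and iteration count here are illustrative.

```python
import numpy as np

m, n, d, k = 4, 100, 8, 2                       # agents, samples per agent, dimension, components
rng = np.random.default_rng(1)
data = [rng.standard_normal((n, d)) @ np.diag(np.linspace(3, 0.5, d)) for _ in range(m)]
covs = [D.T @ D / n for D in data]              # local covariance matrices

W = np.full((m, m), 0.1) + 0.6 * np.eye(m)      # doubly stochastic mixing (complete graph)

X = [np.linalg.qr(rng.standard_normal((d, k)))[0] for _ in range(m)]
for _ in range(50):
    Y = [covs[i] @ X[i] for i in range(m)]                         # local power step
    X = [sum(W[i, j] * Y[j] for j in range(m)) for i in range(m)]  # consensus averaging
    X = [np.linalg.qr(Xi)[0] for Xi in X]                          # re-orthonormalize

# Agents agree (approximately) on the top-k subspace of the average covariance.
print(np.linalg.norm(X[0] @ X[0].T - X[1] @ X[1].T))
```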

Decentralized Accelerated Proximal Gradient Descent

no code implementations NeurIPS 2020 Haishan Ye, Ziang Zhou, Luo Luo, Tong Zhang

In this paper, we propose a new method that establishes the optimal computational complexity and a near-optimal communication complexity.

BIG-bench Machine Learning

Multi-consensus Decentralized Accelerated Gradient Descent

no code implementations 2 May 2020 Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang

This paper considers the decentralized convex optimization problem, which has a wide range of applications in large-scale machine learning, sensor networks, and control theory.

BIG-bench Machine Learning

MiLeNAS: Efficient Neural Architecture Search via Mixed-Level Reformulation

1 code implementation CVPR 2020 Chaoyang He, Haishan Ye, Li Shen, Tong Zhang

To remedy this, this paper proposes MiLeNAS, a mixed-level reformulation for NAS that can be optimized efficiently and reliably.

Bilevel Optimization, Neural Architecture Search +1
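
The mixed-level reformulation can be read as a two-step loop: the model weights w follow the training loss as usual, while the architecture parameters α are updated with a combination of training- and validation-loss gradients instead of the full bilevel (second-order) update. The sketch below is a schematic of that loop with toy quadratic losses and arbitrary step sizes, not the released MiLeNAS code.

```python
import torch

# Toy stand-ins: w are model weights, alpha are architecture parameters.
w = torch.randn(5, requires_grad=True)
alpha = torch.randn(3, requires_grad=True)

def loss_train(w, alpha):                       # placeholder training loss
    return (w ** 2).sum() + (alpha ** 2).sum() + (w[:3] * alpha).sum()

def loss_val(w, alpha):                         # placeholder validation loss
    return ((w[:3] - alpha) ** 2).sum()

eta_w, eta_a, lam = 0.05, 0.05, 1.0
for _ in range(100):
    # Lower level: weights follow the training loss.
    gw = torch.autograd.grad(loss_train(w, alpha), w)[0]
    w = (w - eta_w * gw).detach().requires_grad_(True)
    # Mixed level: architecture follows train + lambda * validation gradients.
    ga_tr = torch.autograd.grad(loss_train(w, alpha), alpha)[0]
    ga_val = torch.autograd.grad(loss_val(w, alpha), alpha)[0]
    alpha = (alpha - eta_a * (ga_tr + lam * ga_val)).detach().requires_grad_(True)

print(loss_val(w, alpha).item())
```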

Stochastic Recursive Gradient Descent Ascent for Stochastic Nonconvex-Strongly-Concave Minimax Problems

no code implementations NeurIPS 2020 Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang

We consider nonconvex-concave minimax optimization problems of the form $\min_{\bf x}\max_{\bf y\in{\mathcal Y}} f({\bf x},{\bf y})$, where $f$ is strongly-concave in $\bf y$ but possibly nonconvex in $\bf x$ and ${\mathcal Y}$ is a convex and compact set.
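
As a baseline for this problem class, the sketch below runs plain stochastic gradient descent ascent on a small nonconvex-strongly-concave example: descend in x, ascend in y, with noisy gradients and a smaller step size in x. The toy objective and step sizes are illustrative; the paper's method instead builds a recursive variance-reduced gradient estimator, which is not shown here.

```python
import numpy as np

rng = np.random.default_rng(2)

# f(x, y) = 0.5*sin(x)**2 + x*y - 0.5*y**2: nonconvex in x, strongly concave in y.
def grad_x(x, y, noise=0.1):
    return np.sin(x) * np.cos(x) + y + noise * rng.standard_normal()

def grad_y(x, y, noise=0.1):
    return x - y + noise * rng.standard_normal()

x, y = 2.0, 2.0
eta_x, eta_y = 0.02, 0.1                        # slower in x than in y, as is standard
for _ in range(3000):
    x -= eta_x * grad_x(x, y)                   # descent on the nonconvex variable
    y += eta_y * grad_y(x, y)                   # ascent on the strongly concave variable
print(x, y)                                     # both settle near the solution at the origin
```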

Mirror Natural Evolution Strategies

no code implementations 25 Oct 2019 Haishan Ye, Tong Zhang

We show that the estimated covariance matrix of MiNES converges to the inverse of the Hessian matrix of the objective function with a sublinear convergence rate.

Nesterov's Acceleration For Approximate Newton

no code implementations 17 Oct 2017 Haishan Ye, Zhihua Zhang

Besides, the accelerated regularized sub-sampled Newton method performs comparably to, or even better than, classical algorithms.

Second-order methods
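
A regularized sub-sampled Newton step, which the acceleration above wraps around, estimates the Hessian on a random subset of the data, adds a ridge term, and solves the resulting linear system. Below is a minimal sketch on a least-squares problem (subsample size, regularization, and iteration count are arbitrary; the Nesterov acceleration itself is omitted):

```python
import numpy as np

rng = np.random.default_rng(3)
N, d, s = 5000, 10, 200                         # full data size, dimension, subsample size
A = rng.standard_normal((N, d))
b = A @ rng.standard_normal(d) + 0.1 * rng.standard_normal(N)

def grad(x):                                    # full gradient of 0.5/N * ||Ax - b||^2
    return A.T @ (A @ x - b) / N

x = np.zeros(d)
for _ in range(20):
    idx = rng.choice(N, size=s, replace=False)  # subsample the data
    As = A[idx]
    H = As.T @ As / s + 1e-2 * np.eye(d)        # sub-sampled, regularized Hessian
    x -= np.linalg.solve(H, grad(x))            # (approximate) Newton step
print(np.linalg.norm(grad(x)))                  # gradient norm shrinks quickly
```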

Approximate Newton Methods and Their Local Convergence

no code implementations ICML 2017 Haishan Ye, Luo Luo, Zhihua Zhang

We propose a unifying framework to analyze local convergence properties of second order methods.

Second-order methods

Nesterov's Acceleration For Second Order Method

no code implementations 19 May 2017 Haishan Ye, Zhihua Zhang

Besides, the accelerated regularized sub-sampled Newton method performs comparably to, or even better than, state-of-the-art algorithms.

Second-order methods
