no code implementations • 23 Feb 2024 • Yanjun Zhao, Sizhe Dang, Haishan Ye, Guang Dai, Yi Qian, Ivor W. Tsang
Fine-tuning large language models (LLMs) with classic first-order optimizers incurs prohibitive GPU memory costs, because backpropagation must store intermediate activations.
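The snippet names the memory bottleneck but not the remedy; zeroth-order methods sidestep it by estimating gradients from forward passes alone. Below is a minimal sketch of the standard two-point estimator that such approaches build on, not the paper's algorithm; all names and constants are illustrative.

```python
import numpy as np

def zo_gradient(f, x, u, mu=1e-3):
    """Two-point zeroth-order gradient estimate of f at x along direction u.
    Only forward evaluations of f are needed, so no backpropagation graph
    (and none of its stored activations) is ever built."""
    return (f(x + mu * u) - f(x - mu * u)) / (2.0 * mu) * u

# Toy usage: minimize a quadratic without ever computing a true gradient.
rng = np.random.default_rng(0)
f = lambda x: 0.5 * np.sum(x**2)
x = np.ones(10)
for _ in range(500):
    u = rng.standard_normal(x.shape)      # fresh random direction each step
    x -= 0.1 * zo_gradient(f, x, u)
print(f(x))  # close to 0
```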
no code implementations • 22 Oct 2023 • Hao Di, Yi Yang, Haishan Ye, Xiangyu Chang
Personalization aims to characterize individual preferences and is widely applied across many fields.
no code implementations • 21 Aug 2023 • Jun Chen, Haishan Ye, Mengmeng Wang, Tianxin Huang, Guang Dai, Ivor W. Tsang, Yong Liu
This paper proposes a decentralized Riemannian conjugate gradient descent (DRCGD) method for minimizing a global function over the Stiefel manifold.
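The DRCGD update itself is not given in the snippet; the sketch below only shows the basic Stiefel-manifold primitives (tangent-space projection and QR retraction) that any such method relies on, applied to a single-agent toy problem:

```python
import numpy as np

def stiefel_retract(X, V):
    """QR retraction: map a tangent step X + V back onto the Stiefel manifold."""
    Q, R = np.linalg.qr(X + V)
    return Q * np.sign(np.diag(R))  # fix the sign ambiguity of QR

def riemannian_grad(X, G):
    """Project a Euclidean gradient G onto the tangent space at X."""
    XtG = X.T @ G
    return G - X @ (XtG + XtG.T) / 2

# Toy usage: leading eigenspace of A via Riemannian descent on -tr(X^T A X).
rng = np.random.default_rng(0)
A = rng.standard_normal((20, 20)); A = A + A.T
X, _ = np.linalg.qr(rng.standard_normal((20, 3)))
for _ in range(200):
    G = -2 * A @ X                                  # Euclidean gradient
    X = stiefel_retract(X, -0.01 * riemannian_grad(X, G))
print(np.linalg.norm(X.T @ X - np.eye(3)))  # ~0: iterate stays on the manifold
```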
no code implementations • 1 Aug 2023 • Haishan Ye
In this paper, we focus on the theory of zeroth-order optimization, which approximates both first-order and second-order information using only zeroth-order (function-value) queries.
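For concreteness, here is one way second-order information can be extracted from zeroth-order queries alone: a three-point central difference recovers the directional curvature $v^\top \nabla^2 f(x) v$. This is a generic illustration, not the specific estimator analyzed in the paper:

```python
import numpy as np

def zo_curvature(f, x, v, mu=1e-4):
    """Directional curvature v^T (Hessian of f at x) v, estimated from
    three function values only (no gradient or Hessian oracle)."""
    return (f(x + mu * v) - 2.0 * f(x) + f(x - mu * v)) / mu**2

# Check on a quadratic, where the curvature along e_2 is exactly 4.
d = np.array([1.0, 4.0, 9.0])
f = lambda x: 0.5 * np.sum(d * x**2)
print(zo_curvature(f, np.zeros(3), np.array([0.0, 1.0, 0.0])))  # ~4.0
```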
no code implementations • 5 Dec 2022 • Lesi Chen, Haishan Ye, Luo Luo
This paper studies the stochastic optimization for decentralized nonconvex-strongly-concave (NC-SC) minimax problems over a multi-agent network.
no code implementations • 25 Oct 2022 • Luo Luo, Haishan Ye
This paper studies the decentralized nonconvex optimization problem $\min_{x\in{\mathbb R}^d} f(x)\triangleq \frac{1}{m}\sum_{i=1}^m f_i(x)$, where $f_i(x)\triangleq \frac{1}{n}\sum_{j=1}^n f_{i, j}(x)$ is the local function on the $i$-th agent of the network.
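A standard baseline for this problem class is decentralized gradient tracking, in which each agent mixes its iterate with neighbors through a doubly stochastic matrix $W$ while maintaining a running estimate of the average gradient. A minimal sketch (illustrative, not the paper's method):

```python
import numpy as np

def gradient_tracking(grads, W, x0, steps=300, lr=0.1):
    """Decentralized gradient tracking: agent i keeps an iterate x[i] and a
    tracker y[i] of the global gradient, mixing with neighbors via W."""
    m = len(grads)
    x = np.tile(x0, (m, 1))
    y = np.array([g(x0) for g in grads])      # initialize gradient trackers
    for _ in range(steps):
        g_old = np.array([grads[i](x[i]) for i in range(m)])
        x = W @ x - lr * y                    # consensus step + descent
        g_new = np.array([grads[i](x[i]) for i in range(m)])
        y = W @ y + g_new - g_old             # track the average gradient
    return x

# Toy usage: m=3 agents, local f_i(x) = 0.5*||x - b_i||^2; optimum is mean(b_i).
b = np.array([[1., 0.], [0., 2.], [3., 1.]])
grads = [lambda x, bi=bi: x - bi for bi in b]
W = np.array([[.5, .25, .25], [.25, .5, .25], [.25, .25, .5]])  # doubly stochastic
print(gradient_tracking(grads, W, np.zeros(2)))  # every row -> mean(b) = [4/3, 1]
```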
no code implementations • 1 Feb 2022 • Luo Luo, Haishan Ye
This paper studies decentralized convex-concave minimax optimization problems of the form $\min_x\max_y f(x, y) \triangleq\frac{1}{m}\sum_{i=1}^m f_i(x, y)$, where $m$ is the number of agents and each local function can be written as $f_i(x, y)=\frac{1}{n}\sum_{j=1}^n f_{i, j}(x, y)$.
no code implementations • NeurIPS 2021 • Dachao Lin, Haishan Ye, Zhihua Zhang
In this paper, we follow Rodomanov and Nesterov’s work to study quasi-Newton methods.
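For reference, the classical BFGS update analyzed in this line of work revises a Hessian approximation $B$ from a step $s$ and gradient difference $y$; on a quadratic, $y = Hs$ holds exactly, which is the setting in which explicit superlinear rates are proved. A minimal sketch (the initialization follows the common assumption $B_0 = L I \succeq H$):

```python
import numpy as np

def bfgs_update(B, s, y):
    """Classical BFGS update of the Hessian approximation B, given a step
    s = x_new - x_old and gradient difference y = grad_new - grad_old."""
    Bs = B @ s
    return B - np.outer(Bs, Bs) / (s @ Bs) + np.outer(y, y) / (y @ s)

# On a quadratic f(x) = 0.5 x^T H x we have y = H s exactly; starting from an
# upper estimate B = L*I >= H, repeated updates with random directions
# drive B toward H.
H = np.diag([1.0, 4.0, 9.0])
B = 10.0 * np.eye(3)
rng = np.random.default_rng(0)
for _ in range(200):
    s = rng.standard_normal(3)
    B = bfgs_update(B, s, H @ s)
print(np.linalg.norm(B - H))  # approaches 0
```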
1 code implementation • ICLR 2022 • Rui Pan, Haishan Ye, Tong Zhang
In this paper, we propose Eigencurve, the first family of learning rate schedules that can achieve minimax optimal convergence rates (up to a constant) for SGD on quadratic objectives when the eigenvalue distribution of the underlying Hessian matrix is skewed.
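Eigencurve's construction is not reproduced here; the sketch below only sets up the kind of experiment the claim refers to: noisy gradient descent on a quadratic whose Hessian spectrum is skewed, where the learning rate schedule visibly changes the final loss. All constants are illustrative:

```python
import numpy as np

# Skewed Hessian spectrum: a few large eigenvalues, many tiny ones.
rng = np.random.default_rng(0)
eigs = np.concatenate([np.full(5, 100.0), np.full(95, 0.01)])

def run_sgd(lr_schedule, steps=1000, noise=0.1):
    """Noisy gradient descent on f(x) = 0.5 * sum(eigs * x^2)."""
    x = np.ones_like(eigs)
    for t in range(steps):
        g = eigs * x + noise * rng.standard_normal(x.shape)
        x -= lr_schedule(t) * g
    return 0.5 * np.sum(eigs * x**2)          # final objective value

# Constant vs. inverse-time decay; stability requires lr < 2 / max(eigs).
print(run_sgd(lambda t: 0.005))
print(run_sgd(lambda t: 0.019 / (1 + 0.01 * t)))
```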
no code implementations • 8 Feb 2021 • Haishan Ye, Tong Zhang
This leads to a decentralized PCA algorithm called DeEPCA, which has a convergence rate similar to that of centralized PCA while achieving the best communication complexity among existing decentralized PCA algorithms.
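As a rough picture of the setting (not the DeEPCA update itself, which avoids the repeated gossip rounds used here): each agent holds a local covariance block, and the network jointly runs a power iteration on their average:

```python
import numpy as np

def decentralized_power_iteration(A_local, W, dim, iters=100, gossip_rounds=10):
    """Estimate the top eigenvector of A = mean(A_local) when each agent only
    holds its own A_i and can only average with neighbors via W."""
    m = len(A_local)
    rng = np.random.default_rng(0)
    w = np.tile(rng.standard_normal(dim), (m, 1))
    for _ in range(iters):
        z = np.array([A_local[i] @ w[i] for i in range(m)])
        for _ in range(gossip_rounds):          # approximate global averaging
            z = W @ z
        w = z / np.linalg.norm(z, axis=1, keepdims=True)
    return w

# Toy usage: 3 agents with random covariance pieces on a complete graph.
rng = np.random.default_rng(1)
A_local = [(lambda M: M @ M.T)(rng.standard_normal((5, 5))) for _ in range(3)]
W = np.full((3, 3), 1 / 3)
w = decentralized_power_iteration(A_local, W, 5)
A = sum(A_local) / 3
print(np.abs(w[0] @ np.linalg.eigh(A)[1][:, -1]))  # ~1: aligned with top eigvec
```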
no code implementations • 30 Dec 2020 • Haishan Ye, Wei Xiong, Tong Zhang
This paper considers the decentralized composite optimization problem.
no code implementations • NeurIPS 2020 • Haishan Ye, Ziang Zhou, Luo Luo, Tong Zhang
In this paper, we propose a new method that establishes the optimal computational complexity and a near-optimal communication complexity.
no code implementations • 5 Sep 2020 • Luo Luo, Cheng Chen, Guangzeng Xie, Haishan Ye
We study the streaming model for approximate matrix multiplication (AMM).
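A common streaming AMM baseline keeps only sketched copies of the inputs: draw a random sketching matrix $S$ and return $(SA)^\top (SB) \approx A^\top B$, updating the sketches one row pair at a time. A minimal Gaussian-sketch version (illustrative; the paper's algorithms and error bounds differ):

```python
import numpy as np

def streaming_amm(row_pairs, k, rng=None):
    """Approximate A^T B by (SA)^T (SB) with a k-row Gaussian sketch S,
    consuming one (row of A, row of B) pair at a time in O(k*d) memory."""
    rng = rng or np.random.default_rng(0)
    SA = SB = None
    for a, b in row_pairs:
        s = rng.standard_normal(k) / np.sqrt(k)   # sketch column for this row
        if SA is None:
            SA, SB = np.outer(s, a), np.outer(s, b)
        else:
            SA += np.outer(s, a); SB += np.outer(s, b)
    return SA.T @ SB

# Toy usage: B correlated with A so that A^T B is nontrivial.
rng = np.random.default_rng(1)
A = rng.standard_normal((5000, 20))
B = A @ rng.standard_normal((20, 30))
approx = streaming_amm(zip(A, B), k=400, rng=np.random.default_rng(2))
exact = A.T @ B
# Modest relative error; it shrinks like 1/sqrt(k).
print(np.linalg.norm(approx - exact) / np.linalg.norm(exact))
```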
no code implementations • 2 May 2020 • Haishan Ye, Luo Luo, Ziang Zhou, Tong Zhang
This paper considers the decentralized convex optimization problem, which has a wide range of applications in large-scale machine learning, sensor networks, and control theory.
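"Composite" here means an objective of the form smooth + nonsmooth, handled through a proximal operator. The single-agent building block, proximal gradient with soft-thresholding for an $\ell_1$ term, looks as follows (the paper's decentralized algorithm builds on this primitive but is not shown):

```python
import numpy as np

def soft_threshold(v, tau):
    """Proximal operator of tau*||.||_1, the nonsmooth part of the composite."""
    return np.sign(v) * np.maximum(np.abs(v) - tau, 0.0)

def proximal_gradient(A, b, lam, steps=500):
    """Solve min_x 0.5*||Ax - b||^2 + lam*||x||_1 by proximal gradient (ISTA)."""
    lr = 1.0 / np.linalg.norm(A, 2) ** 2      # 1/L with L = ||A||_2^2
    x = np.zeros(A.shape[1])
    for _ in range(steps):
        x = soft_threshold(x - lr * A.T @ (A @ x - b), lr * lam)
    return x

# Toy usage: recover a sparse vector from noisy measurements.
rng = np.random.default_rng(0)
A = rng.standard_normal((200, 50))
x_true = np.zeros(50); x_true[:5] = 3.0
b = A @ x_true + 0.1 * rng.standard_normal(200)
print(np.linalg.norm(proximal_gradient(A, b, lam=1.0) - x_true))  # small
```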
1 code implementation • CVPR 2020 • Chaoyang He, Haishan Ye, Li Shen, Tong Zhang
To remedy this, the paper proposes MiLeNAS, a mixed-level reformulation for NAS that can be optimized efficiently and reliably.
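The mixed-level idea can be stated in a few lines: weights follow the training loss as usual, while architecture variables follow a weighted mix of training and validation gradients instead of an expensive bilevel hypergradient. A toy sketch with placeholder gradient functions (not the paper's implementation):

```python
def mixed_level_step(w, alpha, grad_w_tr, grad_a_tr, grad_a_val,
                     lr_w=0.05, lr_a=0.05, lam=1.0):
    """One mixed-level update: weights descend the training loss, while the
    architecture variables descend a mix of training and validation gradients."""
    w_new = w - lr_w * grad_w_tr(w, alpha)
    a_new = alpha - lr_a * (grad_a_tr(w, alpha) + lam * grad_a_val(w, alpha))
    return w_new, a_new

# Toy usage with quadratic surrogate losses (purely illustrative).
g_w_tr = lambda w, a: w - a          # d/dw of 0.5*(w - a)^2
g_a_tr = lambda w, a: a - w          # d/da of the same training loss
g_a_val = lambda w, a: a - 1.0       # validation loss pulls alpha toward 1
w, a = 0.0, 0.0
for _ in range(500):
    w, a = mixed_level_step(w, a, g_w_tr, g_a_tr, g_a_val)
print(w, a)  # both drift toward the validation optimum at 1
```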
no code implementations • NeurIPS 2020 • Luo Luo, Haishan Ye, Zhichao Huang, Tong Zhang
We consider nonconvex-concave minimax optimization problems of the form $\min_{\bf x}\max_{\bf y\in{\mathcal Y}} f({\bf x},{\bf y})$, where $f$ is strongly-concave in $\bf y$ but possibly nonconvex in $\bf x$ and ${\mathcal Y}$ is a convex and compact set.
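The simplest template for this problem class is projected gradient descent-ascent with a faster ascent step, shown below on a toy saddle problem; the paper's stochastic algorithm is considerably more refined, so treat this only as a sketch of the setting:

```python
import numpy as np

def gda(grad_x, grad_y, project_y, x0, y0, lr_x=0.01, lr_y=0.1, steps=2000):
    """Projected gradient descent-ascent for min_x max_{y in Y} f(x, y):
    descend in x, ascend in y, and keep y inside the compact set Y."""
    x, y = x0.copy(), y0.copy()
    for _ in range(steps):
        x = x - lr_x * grad_x(x, y)
        y = project_y(y + lr_y * grad_y(x, y))
    return x, y

# Toy usage: f(x, y) = x*y - 0.5*y^2 on Y = [-1, 1]; strongly concave in y.
gx = lambda x, y: y
gy = lambda x, y: x - y
proj = lambda y: np.clip(y, -1.0, 1.0)
x, y = gda(gx, gy, proj, np.array([1.0]), np.array([0.5]))
print(x, y)  # both approach 0, the saddle point
```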
no code implementations • 27 Dec 2019 • Haishan Ye, Shusen Wang, Zhihua Zhang, Tong Zhang
Fast matrix algorithms have become fundamental tools of machine learning in the big data era.
no code implementations • 25 Oct 2019 • Haishan Ye, Tong Zhang
We show that the estimated covariance matrix of MiNES converges to the inverse of the Hessian matrix of the objective function at a sublinear rate.
no code implementations • 29 Dec 2018 • Haishan Ye, Zhichao Huang, Cong Fang, Chris Junchi Li, Tong Zhang
Zeroth-order optimization is an important research topic in machine learning.
no code implementations • 17 Oct 2017 • Haishan Ye, Zhihua Zhang
Moreover, the accelerated regularized sub-sampled Newton method achieves performance comparable to, or even better than, classical algorithms.
no code implementations • ICML 2017 • Haishan Ye, Luo Luo, Zhihua Zhang
We propose a unifying framework to analyze the local convergence properties of second-order methods.
no code implementations • 19 May 2017 • Haishan Ye, Zhihua Zhang
Moreover, the accelerated regularized sub-sampled Newton method achieves performance comparable to, or even better than, state-of-the-art algorithms.
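A minimal sketch of the basic (non-accelerated) regularized sub-sampled Newton step: estimate the Hessian from a random subsample of the data, add a ridge term for stability, and solve the resulting linear system. Names and constants are illustrative:

```python
import numpy as np

def subsampled_newton_step(x, grad, hess_samples, reg, sample_size, rng):
    """One regularized sub-sampled Newton step: average per-sample Hessians
    over a random subsample, regularize, and solve for the Newton direction."""
    idx = rng.choice(len(hess_samples), size=sample_size, replace=False)
    H = np.mean([hess_samples[i](x) for i in idx], axis=0)
    H += reg * np.eye(len(x))                 # ridge regularization
    return x - np.linalg.solve(H, grad(x))

# Toy usage: least squares, where the exact solution is known.
rng = np.random.default_rng(1)
A = rng.standard_normal((1000, 10)); b = rng.standard_normal(1000)
grad = lambda x: A.T @ (A @ x - b) / len(A)
hess_samples = [lambda x, a=a: np.outer(a, a) for a in A]
x = np.zeros(10)
for _ in range(10):
    x = subsampled_newton_step(x, grad, hess_samples, reg=0.1,
                               sample_size=100, rng=rng)
print(np.linalg.norm(x - np.linalg.lstsq(A, b, rcond=None)[0]))  # small
```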