no code implementations • 28 Aug 2023 • Zhao Song, Junze Yin, Lichen Zhang
Given an input matrix $A\in \mathbb{R}^{n\times d}$ with $n\gg d$ and a response vector $b$, we first consider the matrix exponential of $A^\top A$ as a proxy, and in turn design algorithms for two types of regression problems: $\min_{x\in \mathbb{R}^d}\|(A^\top A)^jx-b\|_2$ and $\min_{x\in \mathbb{R}^d}\|A(A^\top A)^jx-b\|_2$ for any positive integer $j$.
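As a concrete reference point (not the paper's algorithm, which avoids forming $(A^\top A)^j$ explicitly), a naive numpy baseline for these two problems might look like the following sketch; all function names here are illustrative:

```python
import numpy as np

def solve_power_regression(A, b, j):
    """Solve min_x ||(A^T A)^j x - b||_2 by explicitly forming (A^T A)^j.
    Costs O(n d^2 + j d^3); only sensible as a correctness check for small d."""
    M = np.linalg.matrix_power(A.T @ A, j)          # (A^T A)^j, a d x d matrix
    return np.linalg.lstsq(M, b, rcond=None)[0]

def solve_lifted_regression(A, b, j):
    """Solve min_x ||A (A^T A)^j x - b||_2, where b lives in R^n."""
    M = A @ np.linalg.matrix_power(A.T @ A, j)      # n x d design matrix
    return np.linalg.lstsq(M, b, rcond=None)[0]

rng = np.random.default_rng(0)
A = rng.standard_normal((1000, 20))                 # n >> d
x_star = rng.standard_normal(20)
b = A @ np.linalg.matrix_power(A.T @ A, 2) @ x_star
print(np.allclose(solve_lifted_regression(A, b, 2), x_star))  # True
```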
no code implementations • 15 Jul 2023 • Yuzhou Gu, Zhao Song, Lichen Zhang
Consequently, we obtain a variety of results for SVMs:

* For linear SVM, where the quadratic constraint matrix has treewidth $\tau$, we can solve the corresponding program in time $\widetilde O(n\tau^{(\omega+1)/2}\log(1/\epsilon))$.
* For linear SVM, where the quadratic constraint matrix admits a low-rank factorization of rank $k$, we can solve the corresponding program in time $\widetilde O(nk^{(\omega+1)/2}\log(1/\epsilon))$.
* For Gaussian kernel SVM, where the data dimension $d = \Theta(\log n)$ and the squared dataset radius is small, we can solve it in time $O(n^{1+o(1)}\log(1/\epsilon))$.
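For orientation, this is the quadratic program being accelerated: the soft-margin SVM dual, whose quadratic form is the (label-signed) Gram matrix. A minimal sketch using cvxpy as a generic QP solver, with no treewidth or low-rank exploitation (the data and parameter choices below are illustrative):

```python
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(0)
n, d, C = 200, 10, 1.0
X = rng.standard_normal((n, d))
y = np.sign(X @ rng.standard_normal(d))             # synthetic +/-1 labels

Xy = y[:, None] * X                                 # rows of X scaled by labels
alpha = cp.Variable(n)
# Dual objective: sum(alpha) - 1/2 alpha^T Q alpha with Q = Xy Xy^T,
# written via sum_squares so cvxpy recognizes concavity.
obj = cp.Maximize(cp.sum(alpha) - 0.5 * cp.sum_squares(Xy.T @ alpha))
cons = [alpha >= 0, alpha <= C, y @ alpha == 0]
cp.Problem(obj, cons).solve()

w = Xy.T @ alpha.value                              # recovered primal weights
```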
no code implementations • 7 Jun 2023 • Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang
For weighted low rank approximation, this improves the runtime of [LLR16] from $n^2 k^2$ to $n^2k$.
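For context, the [LLR16]-style workhorse is alternating minimization, where each row update is a small weighted least-squares solve. A minimal numpy sketch of that generic iteration (without this entry's runtime improvements; the ridge term and names are illustrative):

```python
import numpy as np

def weighted_lra(M, W, k, iters=50, seed=0):
    """Alternating minimization for min_{U,V} sum_ij W_ij (M_ij - (U V^T)_ij)^2."""
    n, m = M.shape
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((m, k))
    for _ in range(iters):
        for i in range(n):                          # each row of U: a k x k solve
            Vw = V * W[i][:, None]                  # diag(W_i) V
            U[i] = np.linalg.solve(V.T @ Vw + 1e-9 * np.eye(k), Vw.T @ M[i])
        for j in range(m):                          # each row of V, symmetrically
            Uw = U * W[:, j][:, None]
            V[j] = np.linalg.solve(U.T @ Uw + 1e-9 * np.eye(k), Uw.T @ M[:, j])
    return U, V
```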
no code implementations • 21 Feb 2023 • Yuzhou Gu, Zhao Song, Junze Yin, Lichen Zhang
Moreover, our algorithm runs in time $\widetilde O(|\Omega| k)$, which is nearly linear in the time to verify the solution while preserving the sample complexity.
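For a sense of the object being computed, here is a minimal alternating-minimization sketch for completion from an observed entry set $\Omega$; each sweep fits one rank-$k$ least-squares problem per row and column, so it does not achieve the nearly $|\Omega| k$ time claimed above (names and conventions are illustrative):

```python
import numpy as np

def complete(n, m, Omega, vals, k, iters=30, seed=0):
    """Omega: array of (i, j) pairs; vals[t] is the observed entry at Omega[t]."""
    rng = np.random.default_rng(seed)
    U = rng.standard_normal((n, k))
    V = rng.standard_normal((m, k))
    Omega = np.asarray(Omega)
    rows = [np.where(Omega[:, 0] == i)[0] for i in range(n)]
    cols = [np.where(Omega[:, 1] == j)[0] for j in range(m)]
    for _ in range(iters):
        for i in range(n):
            if rows[i].size:                        # refit row i of U on its observations
                U[i] = np.linalg.lstsq(V[Omega[rows[i], 1]], vals[rows[i]], rcond=None)[0]
        for j in range(m):
            if cols[j].size:                        # refit row j of V likewise
                V[j] = np.linalg.lstsq(U[Omega[cols[j], 0]], vals[cols[j]], rcond=None)[0]
    return U, V
```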
no code implementations • 1 Feb 2023 • Zhao Song, Mingquan Ye, Junze Yin, Lichen Zhang
One popular approach for solving such an $\ell_2$ regression problem is sketching: pick a structured random matrix $S\in \mathbb{R}^{m\times n}$ with $m\ll n$ for which $SA$ can be computed quickly, then solve the ``sketched'' regression problem $\arg\min_{x\in \mathbb{R}^d} \|SAx-Sb\|_2$.
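A minimal sketch-and-solve example with a CountSketch-style $S$ (one signed nonzero per column of $S$, so $SA$ takes $O(\mathrm{nnz}(A))$ time); the sketch size $m$ below is an illustrative choice, not the paper's bound:

```python
import numpy as np

def countsketch(A, m, seed=0):
    """Return S @ A for a random CountSketch S in R^{m x n}."""
    n = A.shape[0]
    rng = np.random.default_rng(seed)
    h = rng.integers(0, m, size=n)                  # bucket for each row of A
    s = rng.choice([-1.0, 1.0], size=n)             # random sign for each row
    SA = np.zeros((m, A.shape[1]))
    np.add.at(SA, h, s[:, None] * A)                # O(nnz(A)) work
    return SA

rng = np.random.default_rng(1)
A = rng.standard_normal((100_000, 20))
b = A @ rng.standard_normal(20) + 0.1 * rng.standard_normal(100_000)
SAb = countsketch(np.column_stack([A, b]), m=2_000) # sketch A and b with the same S
x = np.linalg.lstsq(SAb[:, :-1], SAb[:, -1], rcond=None)[0]
```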
no code implementations • 15 Oct 2022 • Zhao Song, Yitan Wang, Zheng Yu, Lichen Zhang
In this paper, we propose a novel sketching scheme for first-order methods in the large-scale distributed learning setting, such that communication costs between distributed agents are reduced while convergence of the algorithms is still guaranteed.
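To illustrate the generic idea (this is a bare random-projection scheme, not the paper's specific construction): each agent transmits the $m$-dimensional sketch $Sg$ of its $d$-dimensional gradient, and the server averages and de-sketches with $S^\top$, which is unbiased for Gaussian $S$:

```python
import numpy as np

d, m, agents = 10_000, 200, 8
rng = np.random.default_rng(0)
S = rng.standard_normal((m, d)) / np.sqrt(m)        # shared via a common seed

grads = [rng.standard_normal(d) for _ in range(agents)]   # stand-in local gradients
sketched = [S @ g for g in grads]                   # each agent sends m floats, not d
avg = np.mean(sketched, axis=0)                     # server-side average
g_hat = S.T @ avg                                   # E[S^T S] = I, so unbiased for mean(grads)
```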
no code implementations • 8 Oct 2022 • Aravind Reddy, Zhao Song, Lichen Zhang
In this work, we initiate the study of \emph{Dynamic Tensor Product Regression}.
no code implementations • 14 Dec 2021 • Zhao Song, Lichen Zhang, Ruizhe Zhang
We consider the problem of training a multi-layer over-parametrized neural network to minimize the empirical risk induced by a loss function.
no code implementations • 29 Sep 2021 • Zhao Song, Zheng Yu, Lichen Zhang
Though most federated learning frameworks only require clients and the server to send gradient information over the network, they still face the challenges of communication efficiency and data privacy.
no code implementations • 21 Aug 2021 • Zhao Song, David P. Woodruff, Zheng Yu, Lichen Zhang
Recent techniques in oblivious sketching reduce the running time's dependence on the degree $q$ of the polynomial kernel from exponential to polynomial, which is useful for the Gaussian kernel, for which $q$ can be chosen to be polylogarithmic.
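For background, the classical oblivious sketch for the degree-$q$ polynomial kernel is TensorSketch: compose $q$ independent CountSketches via FFT so that $\langle TS(x), TS(y)\rangle$ is an unbiased estimate of $\langle x, y\rangle^q$. A minimal numpy sketch (parameters illustrative; this is the baseline, not the improved sketch of this entry):

```python
import numpy as np

def tensorsketch(x, hs, ss, m):
    """hs, ss: q hash/sign arrays defining q independent CountSketches."""
    prod = np.ones(m, dtype=complex)
    for h, s in zip(hs, ss):
        cs = np.zeros(m)
        np.add.at(cs, h, s * x)                     # CountSketch of x
        prod *= np.fft.fft(cs)                      # circular convolution in Fourier domain
    return np.real(np.fft.ifft(prod))

d, m, q = 50, 4096, 3
rng = np.random.default_rng(0)
hs = [rng.integers(0, m, size=d) for _ in range(q)]
ss = [rng.choice([-1.0, 1.0], size=d) for _ in range(q)]

x, y = rng.standard_normal(d), rng.standard_normal(d)
approx = tensorsketch(x, hs, ss, m) @ tensorsketch(y, hs, ss, m)
print(approx, (x @ y) ** q)                         # unbiased; error shrinks as m grows
```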