no code implementations • 24 Mar 2022 • Zhongping Zhang, Huiwen He, Bryan A. Plummer, Zhenyu Liao, Huayan Wang
We outperform the state-of-the-art in a quantitative and qualitative evaluation on the CLEVR and Visual Genome datasets.
no code implementations • 22 Oct 2021 • Jiachen Li, Shuo Cheng, Zhenyu Liao, Huayan Wang, William Yang Wang, Qinxun Bai
Improving the sample efficiency of reinforcement learning algorithms requires effective exploration.
no code implementations • 19 Oct 2021 • Xin Miao, Huayan Wang, Jun Fu, Jiayi Liu, Shen Wang, Zhenyu Liao
The style vectors are fed to the generator and discriminator to achieve fine-grained control.
1 code implementation • ICLR 2022 • Hafiz Tiomoko Ali, Zhenyu Liao, Romain Couillet
As a result, for any kernel matrix ${\bf K}$ of the form above, we propose a novel random features technique, called Ternary Random Feature (TRF), that (i) asymptotically yields the same limiting kernel as the original ${\bf K}$ in a spectral sense and (ii) can be computed and stored much more efficiently, by wisely tuning (in a data-dependent manner) the function $\sigma$ and the random vector ${\bf w}$, both taking values in $\{-1, 0, 1\}$.
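A minimal NumPy sketch of the ternary-random-feature idea described above; the sparsity and threshold values below are illustrative placeholders, since the data-dependent tuning of $\sigma$ and ${\bf w}$ is the paper's actual contribution and is not reproduced here:

```python
import numpy as np

def ternary_random_features(X, n_features, sparsity=0.5, threshold=0.5, seed=0):
    """Sketch of ternary random features (TRF): both the random projection
    matrix W and the feature values lie in {-1, 0, 1}, so the feature matrix
    can be stored with 2 bits per entry. The sparsity/threshold values are
    placeholders; in the paper they are tuned in a data-dependent way so that
    the limiting kernel matches the target kernel K."""
    rng = np.random.default_rng(seed)
    p = X.shape[1]
    # Ternary random projection: entries -1, 0, +1.
    W = rng.choice([-1.0, 0.0, 1.0], size=(n_features, p),
                   p=[(1 - sparsity) / 2, sparsity, (1 - sparsity) / 2])
    Z = X @ W.T / np.sqrt(p)
    # Ternary "activation": quantize the projections to {-1, 0, 1}.
    return np.where(Z > threshold, 1.0, np.where(Z < -threshold, -1.0, 0.0))

# Cheap kernel approximation from the ternary features
X = np.random.randn(100, 50)
S = ternary_random_features(X, n_features=2048)
K_approx = S @ S.T / S.shape[1]
```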
no code implementations • NeurIPS 2021 • Zhenyu Liao, Michael W. Mahoney
Given an optimization problem, the Hessian matrix and its eigenspectrum can be used in many ways, ranging from designing more efficient second-order algorithms to performing model analysis and regression diagnostics.
no code implementations • 21 Nov 2020 • Michał Dereziński, Zhenyu Liao, Edgar Dobriban, Michael W. Mahoney
For a tall $n\times d$ matrix $A$ and a random $m\times n$ sketching matrix $S$, the sketched estimate of the inverse covariance matrix $(A^\top A)^{-1}$ is typically biased: $E[(\tilde A^\top\tilde A)^{-1}]\ne(A^\top A)^{-1}$, where $\tilde A=SA$.
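This bias is easy to reproduce numerically; below is a short Monte Carlo check with a Gaussian sketch. The $m/(m-d-1)$ inflation factor is the classical inverse-Wishart correction for Gaussian sketches, used here only as a sanity check rather than as the paper's general result:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 500, 5, 50            # tall matrix A (n x d), sketch size m > d
A = rng.standard_normal((n, d))
true_inv = np.linalg.inv(A.T @ A)

# Monte Carlo estimate of E[(A~^T A~)^{-1}] for a Gaussian sketch S (m x n),
# scaled so that E[S^T S] = I_n.
trials = 2000
acc = np.zeros((d, d))
for _ in range(trials):
    S = rng.standard_normal((m, n)) / np.sqrt(m)
    A_sk = S @ A
    acc += np.linalg.inv(A_sk.T @ A_sk)
mean_inv = acc / trials

# For a Gaussian sketch the inverse is inflated by roughly m/(m - d - 1).
print(np.diag(mean_inv) / np.diag(true_inv))   # noticeably > 1
print(m / (m - d - 1))                          # classical inverse-Wishart factor
```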
no code implementations • 6 Oct 2020 • Fanghui Liu, Zhenyu Liao, Johan A. K. Suykens
In this paper, we provide a precise characterization of the generalization properties of high dimensional kernel ridge regression across the under- and over-parameterized regimes, depending on whether the number of training samples $n$ exceeds the feature dimension $d$. By establishing a bias-variance decomposition of the expected excess risk, we show that, while the bias is (almost) independent of $d$ and monotonically decreases with $n$, the variance depends on both $n$ and $d$ and can be unimodal or monotonically decreasing under different regularization schemes.
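For reference, the elementary fixed-design, in-sample version of such a decomposition for kernel ridge regression $\hat{\mathbf f} = \mathbf K(\mathbf K + n\lambda \mathbf I_n)^{-1}\mathbf y$, with $\mathbf y = \mathbf f^* + \boldsymbol\varepsilon$, $\mathbb E[\boldsymbol\varepsilon]=\mathbf 0$ and $\operatorname{Cov}[\boldsymbol\varepsilon]=\sigma^2\mathbf I_n$, reads as follows; the paper's result concerns the precise high-dimensional behavior of these two terms, not this identity itself:
$$
\frac{1}{n}\,\mathbb{E}_{\boldsymbol\varepsilon}\big\|\hat{\mathbf f} - \mathbf f^*\big\|^2
= \underbrace{\frac{(n\lambda)^2}{n}\,\big\|(\mathbf K + n\lambda \mathbf I_n)^{-1}\mathbf f^*\big\|^2}_{\text{bias}^2}
+ \underbrace{\frac{\sigma^2}{n}\,\operatorname{tr}\!\big[\mathbf K^2(\mathbf K + n\lambda \mathbf I_n)^{-2}\big]}_{\text{variance}}.
$$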
no code implementations • ICLR 2021 • Zhenyu Liao, Romain Couillet, Michael W. Mahoney
Given a large data matrix, sparsifying, quantizing, and/or performing other entry-wise nonlinear operations can have numerous benefits, ranging from speeding up iterative algorithms for core numerical linear algebra problems to providing nonlinear filters to design state-of-the-art neural network models.
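As a toy illustration of such entry-wise operations (hard-threshold sparsification and 1-bit quantization applied to a data matrix with a rank-one "signal"; the paper's specific operators and guarantees are not reproduced here):

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((1000, 300)) + 0.5   # noise plus a rank-one mean shift

def sparsify(M, keep=0.1):
    """Keep only the largest-magnitude entries (hard-thresholding), zero the rest."""
    t = np.quantile(np.abs(M), 1 - keep)
    return np.where(np.abs(M) >= t, M, 0.0)

def quantize_1bit(M):
    """1-bit quantization: keep only the sign of each entry."""
    return np.sign(M)

for name, M in [("original", X), ("sparsified", sparsify(X)), ("1-bit", quantize_1bit(X))]:
    top = np.linalg.svd(M, compute_uv=False)[:3]
    print(f"{name:>10}: top singular values {np.round(top, 1)}")
```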
no code implementations • NeurIPS 2020 • Michał Dereziński, Feynman Liang, Zhenyu Liao, Michael W. Mahoney
It is often desirable to reduce the dimensionality of a large dataset by projecting it onto a low-dimensional subspace.
no code implementations • NeurIPS 2020 • Zhenyu Liao, Romain Couillet, Michael W. Mahoney
This article characterizes the exact asymptotics of random Fourier feature (RFF) regression, in the realistic setting where the number of data samples $n$, their dimension $p$, and the dimension of feature space $N$ are all large and comparable.
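A minimal sketch of RFF ridge regression for a Gaussian kernel, using the standard Rahimi-Recht cosine/sine features; the dimensions and bandwidth below are illustrative, not the asymptotic regime analyzed in the paper:

```python
import numpy as np

def rff_ridge(X_train, y_train, X_test, N=512, bandwidth=1.0, lam=1e-2, seed=0):
    """Random Fourier feature (RFF) ridge regression approximating a Gaussian kernel."""
    rng = np.random.default_rng(seed)
    p = X_train.shape[1]
    W = rng.standard_normal((p, N // 2)) / bandwidth   # frequencies for the Gaussian kernel

    def features(X):
        Z = X @ W
        return np.hstack([np.cos(Z), np.sin(Z)]) / np.sqrt(N // 2)

    Phi = features(X_train)                            # n x N random feature matrix
    beta = np.linalg.solve(Phi.T @ Phi + lam * np.eye(N), Phi.T @ y_train)
    return features(X_test) @ beta

# Toy usage
X = np.random.randn(200, 10)
y = np.sin(X[:, 0]) + 0.1 * np.random.randn(200)
y_hat = rff_ridge(X[:150], y[:150], X[150:])
print("test MSE:", np.mean((y_hat - y[150:]) ** 2))
```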
3 code implementations • 21 Dec 2019 • Qing Jin, Linjie Yang, Zhenyu Liao
To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training.
1 code implementation • CVPR 2020 • Qing Jin, Linjie Yang, Zhenyu Liao
With our proposed techniques applied to several models, including MobileNet-V1/V2 and ResNet-50, we demonstrate that the bit-width of weights and activations is a new option for adaptively executable deep neural networks, offering a distinct opportunity for an improved accuracy-efficiency trade-off as well as instant adaptation to platform constraints in real-world applications.
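For intuition, here is a generic symmetric uniform quantizer evaluated at several bit-widths; this only illustrates what "bit-width as a runtime option" means, and is not the paper's quantization-aware training scheme:

```python
import numpy as np

def quantize_uniform(w, bits):
    """Symmetric uniform quantization of a weight tensor to the given bit-width."""
    levels = 2 ** (bits - 1) - 1                 # e.g. 127 levels for 8 bits
    scale = np.max(np.abs(w)) / levels
    return np.round(w / scale) * scale           # dequantized values

w = np.random.randn(1000)
for bits in (8, 4, 2):                           # same weights served at different bit-widths
    err = np.mean((w - quantize_uniform(w, bits)) ** 2)
    print(f"{bits}-bit quantization, MSE = {err:.2e}")
```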
no code implementations • 25 Sep 2019 • Qing Jin, Linjie Yang, Zhenyu Liao
To deal with this problem, we propose a simple yet effective technique, named scale-adjusted training (SAT), to comply with the discovered rules and facilitate efficient training.
no code implementations • 15 Sep 2019 • Zhenyu Liao, Romain Couillet
This article investigates the eigenspectrum of the inner-product-type kernel matrix $\sqrt{p} \mathbf{K}=\{f( \mathbf{x}_i^{\sf T} \mathbf{x}_j/\sqrt{p})\}_{i, j=1}^n$ under a binary mixture model in the high dimensional regime where the number of data points $n$ and their dimension $p$ are both large and comparable.
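A quick numerical look at this object under a two-class Gaussian mixture (the nonlinearity $f$ and mixture parameters below are placeholders): most eigenvalues form a bulk, with a few isolated spikes carrying the class information.

```python
import numpy as np

rng = np.random.default_rng(0)
p, n = 512, 1024                            # dimension and sample size, both large and comparable
mu = np.zeros(p); mu[0] = 2.0               # class means +mu and -mu
labels = rng.integers(0, 2, size=n)
X = rng.standard_normal((n, p)) + np.where(labels[:, None] == 1, mu, -mu)

f = np.tanh                                 # placeholder entry-wise nonlinearity
sqrtp_K = f(X @ X.T / np.sqrt(p))           # the matrix sqrt(p)*K in the notation above
eigvals = np.linalg.eigvalsh(sqrtp_K)

print("largest eigenvalues:", np.round(np.sort(eigvals)[-5:], 1))
print("bulk edges (5th/95th percentile):", np.round(np.percentile(eigvals, [5, 95]), 1))
```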
no code implementations • 6 Jun 2019 • Yuexiang Zhai, Zitong Yang, Zhenyu Liao, John Wright, Yi Ma
Most existing methods solve the dictionary (and sparse representations) based on heuristic algorithms, usually without theoretical guarantees for either optimality or complexity.
no code implementations • 31 May 2019 • Xiaoyi Mai, Zhenyu Liao
Building upon this quantitative error analysis, we identify the simple square loss as the optimal choice for high dimensional classification in both ridge-regularized and unregularized cases, regardless of the number of training samples.
1 code implementation • ECCV 2020 • Yingwei Li, Song Bai, Cihang Xie, Zhenyu Liao, Xiaohui Shen, Alan L. Yuille
We observe the property of regional homogeneity in adversarial perturbations and suggest that the defenses are less robust to regionally homogeneous perturbations.
no code implementations • 8 Nov 2018 • Yacine Chitour, Zhenyu Liao, Romain Couillet
We translate a well-known empirical observation of linear neural nets into a conjecture that we call the \emph{overfitting conjecture} which states that, for almost all training data and initial conditions, the trajectory of the corresponding gradient descent system converges to a global minimum.
1 code implementation • ICML 2018 • Zhenyu Liao, Romain Couillet
Random feature maps are ubiquitous in modern statistical machine learning, where they generalize random projections by means of powerful, yet often difficult to analyze nonlinear operators.
no code implementations • ICML 2018 • Zhenyu Liao, Romain Couillet
Understanding the learning dynamics of neural networks is one of the key issues for the improvement of optimization algorithms as well as for the theoretical comprehension of why deep neural nets work so well today.
1 code implementation • 17 Feb 2017 • Cosme Louart, Zhenyu Liao, Romain Couillet
This article studies the Gram random matrix model $G=\frac1T\Sigma^{\rm T}\Sigma$, $\Sigma=\sigma(WX)$, classically found in the analysis of random feature maps and random neural networks, where $X=[x_1,\ldots, x_T]\in{\mathbb R}^{p\times T}$ is a (data) matrix of bounded norm, $W\in{\mathbb R}^{n\times p}$ is a matrix of independent zero-mean unit-variance entries, and $\sigma:{\mathbb R}\to{\mathbb R}$ is a Lipschitz continuous (activation) function, with $\sigma(WX)$ understood entry-wise.
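The model is easy to simulate directly from these definitions; a short NumPy sketch (ReLU as one admissible Lipschitz activation, dimensions chosen only for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
p, n, T = 256, 512, 1024                        # data dimension, width, sample count (all comparable)
X = rng.standard_normal((p, T)) / np.sqrt(p)    # (data) matrix of bounded operator norm
W = rng.standard_normal((n, p))                 # i.i.d. zero-mean, unit-variance entries
sigma = lambda M: np.maximum(M, 0.0)            # ReLU: one admissible Lipschitz activation

Sigma = sigma(W @ X)                            # sigma(WX), applied entry-wise
G = Sigma.T @ Sigma / T                         # Gram matrix G = (1/T) Sigma^T Sigma
eigvals = np.linalg.eigvalsh(G)
print("empirical spectrum of G: min =", round(eigvals.min(), 3),
      ", max =", round(eigvals.max(), 3))
```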
1 code implementation • 11 Jan 2017 • Zhenyu Liao, Romain Couillet
In this article, a large dimensional performance analysis of kernel least squares support vector machines (LS-SVMs) is provided under the assumption of a two-class Gaussian mixture model for the input data.
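For concreteness, here is a compact kernel LS-SVM classifier in one of its standard formulations, obtained by solving a single $(n+1)\times(n+1)$ linear system; the RBF kernel, hyperparameters, and toy mixture data below are illustrative, and the paper's large-dimensional analysis of this classifier is not reproduced:

```python
import numpy as np

def lssvm_fit(X, y, gamma=1.0, bw=1.0):
    """Train a kernel LS-SVM: solve [[0, 1^T], [1, K + I/gamma]] [b; alpha] = [0; y]
    with an RBF kernel of bandwidth bw."""
    n = X.shape[0]
    sq = np.sum(X**2, axis=1)
    K = np.exp(-(sq[:, None] + sq[None, :] - 2 * X @ X.T) / (2 * bw**2))
    A = np.zeros((n + 1, n + 1))
    A[0, 1:] = 1.0
    A[1:, 0] = 1.0
    A[1:, 1:] = K + np.eye(n) / gamma
    sol = np.linalg.solve(A, np.concatenate([[0.0], y]))
    return sol[0], sol[1:]                       # bias b, dual variables alpha

def lssvm_predict(X_train, b, alpha, X_test, bw=1.0):
    sq_tr = np.sum(X_train**2, axis=1)
    sq_te = np.sum(X_test**2, axis=1)
    K = np.exp(-(sq_te[:, None] + sq_tr[None, :] - 2 * X_test @ X_train.T) / (2 * bw**2))
    return np.sign(K @ alpha + b)

# Two-class Gaussian mixture, in the spirit of the paper's input model
rng = np.random.default_rng(0)
n, p = 200, 50
y = np.where(rng.integers(0, 2, n) == 1, 1.0, -1.0)
X = rng.standard_normal((n, p)) + 0.8 * y[:, None] * np.ones(p) / np.sqrt(p)
b, alpha = lssvm_fit(X, y)
print("training accuracy:", np.mean(lssvm_predict(X, b, alpha, X) == y))
```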
no code implementations • 7 Sep 2016 • Zhenyu Liao, Romain Couillet
This article proposes a performance analysis of kernel least squares support vector machines (LS-SVMs) based on a random matrix approach, in the regime where both the data dimension $p$ and the number of samples $n$ grow large at the same rate.
no code implementations • 16 Jun 2015 • Zeyuan Allen-Zhu, Zhenyu Liao, Lorenzo Orecchia
In this paper, we provide a novel construction of the linear-sized spectral sparsifiers of Batson, Spielman and Srivastava [BSS14].