no code implementations • 19 Mar 2024 • Yongtao Wu, Fanghui Liu, Carl-Johann Simon-Gabriel, Grigorios G. Chrysos, Volkan Cevher
Recent developments in neural architecture search (NAS) emphasize the importance of considering architectures that are robust to malicious data.
no code implementations • 14 Mar 2024 • Yihang Chen, Fanghui Liu, Yiping Lu, Grigorios G. Chrysos, Volkan Cevher
To derive generalization bounds in this setting, our analysis necessitates a shift from the conventional time-invariant Gram matrix employed in the lazy training regime to a time-variant, distribution-dependent version.
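For orientation, the Gram matrix in question is the kernel matrix induced by the network's parameter gradients; in generic notation (not necessarily the paper's):

```latex
% NTK-style Gram matrix at training time t (generic notation):
H_{ij}(t) = \big\langle \nabla_\theta f(x_i; \theta_t),\, \nabla_\theta f(x_j; \theta_t) \big\rangle .
% Lazy training freezes \theta_t \approx \theta_0, making H(t) \approx H(0)
% time-invariant; outside that regime, H(t) evolves with both \theta_t and
% the data distribution.
```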
no code implementations • 24 Jan 2024 • Zhongjie Shi, Fanghui Liu, Yuan Cao, Johan A. K. Suykens
Adversarial training is a widely used method to improve the robustness of deep neural networks (DNNs) against adversarial perturbations.
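For context, a minimal PGD-based adversarial training step might look as follows. This is a generic sketch (using PyTorch), not this paper's method; the eps, alpha, and steps values are placeholder choices:

```python
import torch
import torch.nn.functional as F

def pgd_adversarial_loss(model, x, y, eps=8/255, alpha=2/255, steps=10):
    """Craft a PGD perturbation of x, then return the loss on the adversarial batch."""
    x_adv = x + torch.empty_like(x).uniform_(-eps, eps)   # random start in the eps-ball
    for _ in range(steps):
        x_adv = x_adv.detach().requires_grad_(True)
        loss = F.cross_entropy(model(x_adv), y)
        grad = torch.autograd.grad(loss, x_adv)[0]
        x_adv = x_adv + alpha * grad.sign()               # ascend the loss
        x_adv = x + (x_adv - x).clamp(-eps, eps)          # project back into the eps-ball
        x_adv = x_adv.clamp(0.0, 1.0)                     # stay in valid pixel range
    return F.cross_entropy(model(x_adv.detach()), y)      # robust loss to backpropagate
```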
1 code implementation • 21 Jan 2024 • Elias Abad Rocamora, Fanghui Liu, Grigorios G. Chrysos, Pablo M. Olmos, Volkan Cevher
Our regularization term can be theoretically linked to the curvature of the loss function and is computationally cheaper than previous methods, as it avoids Double Backpropagation.
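For context, double backpropagation refers to penalizing the input-gradient norm (the classic Drucker-LeCun-style regularizer, shown below in generic notation), whose gradient with respect to the parameters requires differentiating through a gradient, i.e., two backward passes:

```latex
% Input-gradient (double backpropagation) penalty:
R(\theta) = \big\| \nabla_x\, \ell\big(f(x; \theta),\, y\big) \big\|_2^2 .
```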
no code implementations • 8 Jun 2023 • Ali Ramezani-Kebrya, Fanghui Liu, Thomas Pethick, Grigorios Chrysos, Volkan Cevher
This paper addresses intra-client and inter-client covariate shifts in federated learning (FL) with a focus on the overall generalization performance.
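As a reminder of the terminology (a generic definition, not specific to this paper), covariate shift between clients means the input marginals differ while the labeling rule is shared:

```latex
% Covariate shift across clients i and j:
P_i(x) \neq P_j(x), \qquad P_i(y \mid x) = P_j(y \mid x).
```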
no code implementations • 30 May 2023 • Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Francesco Locatello, Volkan Cevher
This paper focuses on over-parameterized deep neural networks (DNNs) with ReLU activation functions and proves that, when the data distribution is well-separated, DNNs can achieve Bayes-optimal test error for classification while obtaining (nearly) zero training error under the lazy training regime.
no code implementations • 25 Apr 2023 • Fanghui Liu, Luca Viano, Volkan Cevher
In online reinforcement learning (RL), instead of employing standard structural assumptions on Markov decision processes (MDPs), using a certain coverage condition (originally proposed in offline RL) is enough to ensure sample-efficient guarantees (Xie et al., 2023).
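One common form of such a coverage condition is a bounded single-policy concentrability coefficient, shown below in a standard form that may differ from the paper's exact condition:

```latex
% Single-policy concentrability: the comparator policy's occupancy measure
% d^{\pi^*} is dominated by the data distribution \mu up to a constant:
C^* = \sup_{s,\, a} \frac{d^{\pi^*}(s, a)}{\mu(s, a)} < \infty .
```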
no code implementations • 18 Sep 2022 • Mingzhen He, Fan He, Fanghui Liu, Xiaolin Huang
The theoretical foundation of RFFs rests on Bochner's theorem, which relates symmetric, positive definite (PD) functions to probability measures.
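For reference, Bochner's theorem states that a continuous shift-invariant kernel is PD if and only if it is the Fourier transform of a finite nonnegative measure; after normalization, that measure is a probability distribution:

```latex
% Bochner's theorem for a shift-invariant kernel k:
k(x - y) = \int_{\mathbb{R}^d} e^{\, i\, \omega^\top (x - y)}\, \mathrm{d}\mu(\omega)
         = \mathbb{E}_{\omega \sim \mu}\big[ e^{\, i\, \omega^\top (x - y)} \big],
% which RFFs approximate by Monte Carlo sampling of \omega.
```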
no code implementations • 16 Sep 2022 • Yongtao Wu, Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
The neural tangent kernel (NTK) is a powerful tool for analyzing the training dynamics of neural networks and their generalization bounds.
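Concretely, the NTK of a network f(·; θ) is the kernel induced by its parameter gradients (the standard definition):

```latex
% Neural tangent kernel at parameters \theta:
\Theta(x, x') = \big\langle \nabla_\theta f(x; \theta),\, \nabla_\theta f(x'; \theta) \big\rangle ,
% which remains (nearly) constant during training in the infinite-width limit.
```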
no code implementations • 15 Sep 2022 • Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
To this end, we derive lower (and upper) bounds on the minimum eigenvalue of the Neural Tangent Kernel (NTK) in the finite- and infinite-width regimes, over a search space that includes mixed activation functions, fully connected networks, and residual neural networks.
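The minimum eigenvalue is the quantity of interest because, in lazy-training analyses, it typically governs how fast gradient descent fits the training labels (a standard consequence, stated informally in our notation):

```latex
% With u(t) the network outputs on the training set and y the labels:
\| u(t) - y \|_2^2 \;\lesssim\; e^{-\lambda_{\min}(\Theta)\, t}\, \| u(0) - y \|_2^2 .
```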
no code implementations • 15 Sep 2022 • Fanghui Liu, Luca Viano, Volkan Cevher
To be specific, we focus on value-based algorithms with $\epsilon$-greedy exploration via deep (and two-layer) neural networks endowed with Besov (and Barron) function spaces, respectively, aiming to approximate an $\alpha$-smooth Q-function in a $d$-dimensional feature space.
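For readers unfamiliar with the exploration scheme, $\epsilon$-greedy mixes a uniform random action with the greedy action under the current Q-estimate. A generic sketch, independent of the paper's function classes:

```python
import numpy as np

def epsilon_greedy(q_values, epsilon, rng):
    """With probability epsilon take a uniform random action,
    otherwise act greedily w.r.t. the current Q-estimate."""
    if rng.random() < epsilon:
        return int(rng.integers(len(q_values)))
    return int(np.argmax(q_values))

# Usage with a hypothetical Q-estimate for one state:
q = np.array([0.2, 0.7, 0.1])
action = epsilon_greedy(q, epsilon=0.1, rng=np.random.default_rng(0))
```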
1 code implementation • 15 Sep 2022 • Elias Abad Rocamora, Mehmet Fatih Sahin, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
Polynomial Networks (PNs) have recently demonstrated promising performance on face and image recognition.
no code implementations • 15 Sep 2022 • Zhenyu Zhu, Fanghui Liu, Grigorios G. Chrysos, Volkan Cevher
In particular, when initialized with LeCun initialization, depth helps robustness in the lazy training regime.
no code implementations • 13 Oct 2021 • Fanghui Liu, Johan A. K. Suykens, Volkan Cevher
We study the generalization properties of random features (RF) regression in high dimensions, optimized by stochastic gradient descent (SGD), in the under- and over-parameterized regimes.
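As a concrete reference point, here is a minimal sketch of the pipeline being analyzed: Gaussian-kernel random features fit by plain SGD on the squared loss. The dimensions and step-size schedule are illustrative choices, not the paper's:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, D = 200, 10, 500                     # samples, input dim, random features
X = rng.standard_normal((n, d))
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(n)

# Random features for the Gaussian kernel: z(x) = sqrt(2/D) * cos(x @ W + b)
W = rng.standard_normal((d, D))
b = rng.uniform(0.0, 2.0 * np.pi, D)
Z = np.sqrt(2.0 / D) * np.cos(X @ W + b)

theta = np.zeros(D)
for t in range(5000):                      # plain SGD on the squared loss
    i = rng.integers(n)
    grad = (Z[i] @ theta - y[i]) * Z[i]
    theta -= 0.5 / np.sqrt(t + 1) * grad   # decaying step size
```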
no code implementations • 3 Nov 2020 • Fanghui Liu, Xiaolin Huang, Yudong Chen, Johan A. K. Suykens
In this paper, we develop a quadrature framework for large-scale kernel machines via a numerical integration representation.
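Schematically, the idea is to discretize the kernel's integral representation with a quadrature rule; the nodes and weights below are placeholders determined by whichever rule is chosen:

```latex
% Quadrature view of a shift-invariant kernel: replace the integral by
% s weighted nodes \{(a_j, \omega_j)\}_{j=1}^{s}:
k(x - y) = \int_{\mathbb{R}^d} e^{\, i\, \omega^\top (x - y)}\, p(\omega)\, \mathrm{d}\omega
\;\approx\; \sum_{j=1}^{s} a_j\, e^{\, i\, \omega_j^\top (x - y)} .
```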
no code implementations • 6 Oct 2020 • Fanghui Liu, Zhenyu Liao, Johan A. K. Suykens
In this paper, we provide a precise characterization of the generalization properties of high-dimensional kernel ridge regression across the under- and over-parameterized regimes, depending on whether the number of training samples n exceeds the feature dimension d. By establishing a bias-variance decomposition of the expected excess risk, we show that, while the bias is (almost) independent of d and monotonically decreases with n, the variance depends on n and d, and can be unimodal or monotonically decreasing under different regularization schemes.
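In symbols, the decomposition in question is the standard one, taken over the randomness of the training data (our notation):

```latex
% Bias-variance decomposition of the expected excess risk of the
% estimator \hat f (expectation over the training data):
\mathbb{E}\big[\mathcal{E}(\hat f)\big]
= \underbrace{\big\| \mathbb{E}[\hat f] - f^* \big\|_{L^2}^2}_{\text{Bias}^2}
+ \underbrace{\mathbb{E}\, \big\| \hat f - \mathbb{E}[\hat f] \big\|_{L^2}^2}_{\text{Variance}} .
```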
1 code implementation • 10 Sep 2020 • Kun Fang, Fanghui Liu, Xiaolin Huang, Jie Yang
In the second-stage process, a linear learner is trained on the mapped random features.
no code implementations • 1 Jun 2020 • Fanghui Liu, Lei Shi, Xiaolin Huang, Jie Yang, Johan A. K. Suykens
In this paper, we study the asymptotic properties of regularized least squares with indefinite kernels in reproducing kernel Krein spaces (RKKS).
no code implementations • 30 May 2020 • Fanghui Liu, Xiaolin Huang, Yingyi Chen, Johan A. K. Suykens
In this paper, we attempt to solve a long-standing open question about non-positive definite (non-PD) kernels in the machine learning community: can a given non-PD kernel be decomposed into the difference of two PD kernels (termed a positive decomposition)?
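In symbols, the question asks whether every such kernel admits a decomposition of the form below, mirroring how a Krein space decomposes into a difference of two Hilbert spaces (our notation):

```latex
% Positive decomposition: k splits into a difference of two PD kernels,
k = k_{+} - k_{-}, \qquad k_{+},\, k_{-}\ \text{positive definite}.
```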
no code implementations • 23 Apr 2020 • Fanghui Liu, Xiaolin Huang, Yudong Chen, Johan A. K. Suykens
This survey may serve as a gentle introduction to this topic, and as a users' guide for practitioners interested in applying the representative algorithms and understanding theoretical results under various technical assumptions.
no code implementations • 20 Nov 2019 • Fanghui Liu, Xiaolin Huang, Yudong Chen, Jie Yang, Johan A. K. Suykens
In this paper, we propose a fast surrogate leverage-weighted sampling strategy to generate refined random Fourier features for kernel approximation.
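For context, here is a generic importance-weighted random Fourier feature sketch. It uses a simple Gaussian proposal with a hypothetical width sigma_q and reweights features by sqrt(p/q), rather than the paper's surrogate leverage scores:

```python
import numpy as np

rng = np.random.default_rng(1)
d, D = 5, 300

# Plain RFF draws frequencies from the kernel's spectral density p (a standard
# normal for the unit-bandwidth Gaussian kernel). Importance-weighted variants
# draw from a proposal q and reweight each feature by sqrt(p/q) so the kernel
# estimate stays unbiased. sigma_q is a hypothetical proposal width.
sigma_q = 1.5
W = sigma_q * rng.standard_normal((D, d))           # omega_j ~ q = N(0, sigma_q^2 I)
log_p = -0.5 * (W ** 2).sum(axis=1)                 # log N(0, I), up to shared constants
log_q = -0.5 * (W ** 2).sum(axis=1) / sigma_q**2 - d * np.log(sigma_q)
w = np.sqrt(np.exp(log_p - log_q))                  # per-feature importance weights

def phi(X):
    """Weighted random Fourier features: k(X, Y) ≈ phi(X) @ phi(Y).T."""
    proj = X @ W.T
    return np.hstack([np.cos(proj), np.sin(proj)]) * np.tile(w, 2) / np.sqrt(D)
```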
no code implementations • 7 Oct 2019 • Jiaxuan Xie, Fanghui Liu, Kaijie Wang, Xiaolin Huang
On small datasets (fewer than 1,000 samples), where deep learning is generally unsuitable due to overfitting, our method achieves superior performance compared with advanced kernel methods.
no code implementations • 15 Apr 2019 • Fanghui Liu, Chen Gong, Xiaolin Huang, Tao Zhou, Jie Yang, Dacheng Tao
In this paper, we propose a novel matching-based tracker by investigating the relationship between template matching and the recently popular correlation-filter-based trackers (CFTs).
no code implementations • 26 Sep 2018 • Fanghui Liu, Lei Shi, Xiaolin Huang, Jie Yang, Johan A. K. Suykens
This paper generalizes regularized regression problems to a hyper-reproducing kernel Hilbert space (hyper-RKHS), illustrates its utility for kernel learning and out-of-sample extensions, and proves asymptotic convergence results for the introduced regression models from an approximation-theoretic viewpoint.
no code implementations • 31 Aug 2018 • Fanghui Liu, Xiaolin Huang, Chen Gong, Jie Yang, Li Li
Learning this data-adaptive matrix in a formulation-free manner enlarges the margin between classes and thus improves model flexibility.
no code implementations • 6 Jul 2017 • Fanghui Liu, Xiaolin Huang, Chen Gong, Jie Yang, Johan A. K. Suykens
Since the concave-convex procedure must solve a sub-problem at each iteration, we propose a concave-inexact-convex procedure (CCICP) with an inexact solving scheme to accelerate the overall optimization.
no code implementations • 5 Oct 2015 • Fanghui Liu, Tao Zhou, Irene Y. H. Gu, Jie Yang
The weights of these dictionaries are also learned from the approximated LLC within the same framework.
no code implementations • 20 Sep 2015 • Fanghui Liu, Tao Zhou, Keren Fu, Irene Y. H. Gu, Jie Yang
It utilizes both foreground and background information and imposes a local coordinate constraint, where the basis matrix is a sparse matrix formed from a linear combination of candidates with corresponding nonnegative coefficient vectors.