no code implementations • 24 Feb 2022 • Zhiying Fang, Yidong Ouyang, Ding-Xuan Zhou, Guang Cheng
In this work, we show that with suitable adaptations, the single-head self-attention transformer with a fixed number of transformer encoder blocks and free parameters is able to generate any desired polynomial of the input with no error.
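For illustration only, a minimal single-head self-attention layer can be sketched in a few lines of NumPy; the dimensions, the absence of masking and positional encoding, and the random weights are assumptions for the sketch, not the construction analyzed in the paper.

```python
import numpy as np

def single_head_self_attention(X, Wq, Wk, Wv):
    """Minimal single-head self-attention.
    X is (seq_len, d_model); Wq, Wk, Wv are (d_model, d_k) projections."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])           # scaled dot-product
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)    # row-wise softmax
    return weights @ V                                # (seq_len, d_k)

rng = np.random.default_rng(0)
X = rng.standard_normal((5, 8))                       # 5 tokens, model width 8
Wq, Wk, Wv = (rng.standard_normal((8, 8)) for _ in range(3))
print(single_head_self_attention(X, Wq, Wk, Wv).shape)  # (5, 8)
```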
no code implementations • 5 Dec 2021 • Han Feng, Shao-Bo Lin, Ding-Xuan Zhou
This paper proposes a distributed weighted regularized least squares algorithm (DWRLS) based on spherical radial basis functions and spherical quadrature rules to tackle spherical data that are stored across numerous local servers and cannot be shared with each other.
no code implementations • 28 Nov 2021 • Shao-Bo Lin, Yao Wang, Ding-Xuan Zhou
In this paper, we study the generalization performance of global minima for implementing empirical risk minimization (ERM) on over-parameterized deep ReLU nets.
no code implementations • 2 Jul 2021 • Tong Mao, Zhongjie Shi, Ding-Xuan Zhou
We consider a family of deep neural networks consisting of two groups of convolutional layers, a downsampling operator, and a fully connected layer.
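A schematic version of such an architecture, with illustrative kernel sizes and input length that are assumptions rather than the paper's construction, might look as follows: two groups of 1-D convolutional layers, a stride-2 downsampling operator, and a fully connected output layer.

```python
import numpy as np

def conv1d(x, w):
    """Valid 1-D correlation of a signal x with a filter w."""
    n, s = len(x), len(w)
    return np.array([x[i:i + s] @ w for i in range(n - s + 1)])

def relu(z):
    return np.maximum(z, 0.0)

def downsample(x, stride=2):
    return x[::stride]

def forward(x, conv_group1, conv_group2, W_fc, b_fc):
    h = x
    for w in conv_group1:          # first group of convolutional layers
        h = relu(conv1d(h, w))
    h = downsample(h)              # downsampling operator
    for w in conv_group2:          # second group of convolutional layers
        h = relu(conv1d(h, w))
    return W_fc @ h + b_fc         # fully connected output layer

rng = np.random.default_rng(1)
x = rng.standard_normal(32)
g1 = [rng.standard_normal(3) for _ in range(2)]
g2 = [rng.standard_normal(3) for _ in range(2)]
# hidden lengths: 32 -> 30 -> 28 -> downsample -> 14 -> 12 -> 10
W_fc, b_fc = rng.standard_normal((1, 10)), rng.standard_normal(1)
print(forward(x, g1, g2, W_fc, b_fc))
```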
no code implementations • 23 Jun 2021 • Shao-Bo Lin, Kaidong Wang, Yao Wang, Ding-Xuan Zhou
Compared with the intense research activity on deep convolutional neural networks (DCNNs) in practice, the study of their theoretical behavior lags far behind.
no code implementations • 21 Apr 2021 • Zhan Yu, Daniel W. C. Ho, Ding-Xuan Zhou
Regularization schemes for regression have been widely studied in learning theory and inverse problems.
no code implementations • 21 Jan 2021 • Jinshan Zeng, Wotao Yin, Ding-Xuan Zhou
We modify ALM to use a Moreau envelope of the augmented Lagrangian and establish its convergence under conditions that are weaker than those in the literature.
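As a rough sketch of the general idea only, the following is a generic proximal-term variant of ALM applied to a toy equality-constrained quadratic program; the toy objective, the parameters, and the update form are assumptions for illustration and not the paper's scheme or its convergence conditions.

```python
import numpy as np

def proximal_alm(A, b, c, rho=1.0, beta=1.0, iters=200):
    """Proximal (Moreau-envelope-style) ALM for
       min_x 0.5*||x - c||^2  s.t.  A x = b   (toy smooth problem)."""
    m, n = A.shape
    x, lam = np.zeros(n), np.zeros(m)
    H = np.eye(n) + rho * A.T @ A + (1.0 / beta) * np.eye(n)
    for _ in range(iters):
        rhs = c - A.T @ lam + rho * A.T @ b + x / beta
        x = np.linalg.solve(H, rhs)          # proximal x-update
        lam = lam + rho * (A @ x - b)        # multiplier (dual) update
    return x, lam

rng = np.random.default_rng(2)
A, b, c = rng.standard_normal((2, 5)), rng.standard_normal(2), rng.standard_normal(5)
x, lam = proximal_alm(A, b, c)
print(np.linalg.norm(A @ x - b))             # constraint residual shrinks over iterations
```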
no code implementations • 28 Jul 2020 • Zhiying Fang, Han Feng, Shuo Huang, Ding-Xuan Zhou
Deep learning based on deep neural networks of various structures and architectures has been powerful in many practical applications, but it still lacks sufficient theoretical justification.
no code implementations • 1 Apr 2020 • Zhi Han, Siquan Yu, Shao-Bo Lin, Ding-Xuan Zhou
One of the most important challenges of deep learning is to figure out the relation between a feature and the depth of deep neural networks (deep nets for short), so as to reflect the necessity of depth.
no code implementations • 27 Mar 2020 • Shao-Bo Lin, Di Wang, Ding-Xuan Zhou
This paper focuses on generalization performance analysis for distributed algorithms in the framework of learning theory.
no code implementations • 16 Dec 2019 • Charles K. Chui, Shao-Bo Lin, Bo Zhang, Ding-Xuan Zhou
The great success of deep learning poses urgent challenges for understanding its working mechanisms and its rationale.
no code implementations • 3 Dec 2019 • Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu
An intriguing phenomenon observed during training neural networks is the spectral bias, which states that neural networks are biased towards learning less complex functions.
no code implementations • NeurIPS 2019 • Yunwen Lei, Peng Yang, Ke Tang, Ding-Xuan Zhou
In this paper, we propose a theoretically sound strategy to select an individual iterate of the vanilla SCMD, which is able to achieve optimal rates for both convex and strongly convex problems in a non-smooth learning setting.
1 code implementation • 24 Nov 2019 • Jinshan Zeng, Minrun Wu, Shao-Bo Lin, Ding-Xuan Zhou
In the era of big data, it is highly desirable to develop efficient machine learning algorithms to tackle massive-data challenges such as storage bottlenecks, algorithmic scalability, and interpretability.
no code implementations • 6 Oct 2019 • Shao-Bo Lin, Yu Guang Wang, Ding-Xuan Zhou
This paper develops distributed filtered hyperinterpolation for noisy data on the sphere, which assigns the data-fitting task to multiple servers to find a good approximation of the mapping from input data to output data.
no code implementations • 3 Apr 2019 • Charles K. Chui, Shao-Bo Lin, Ding-Xuan Zhou
Based on the tree architecture, the objective of this paper is to design deep neural networks with two or more hidden layers (called deep nets) for the realization of radial functions, so as to enable rotational invariance for near-optimal function approximation in an arbitrarily high dimensional Euclidean space.
1 code implementation • 6 Feb 2019 • Jinshan Zeng, Shao-Bo Lin, Yuan YAO, Ding-Xuan Zhou
In this paper, we develop an alternating direction method of multipliers (ADMM) for training deep neural networks with sigmoid-type activation functions (called the \textit{sigmoid-ADMM pair}). The method is mainly motivated by the gradient-free nature of ADMM, which avoids the saturation of sigmoid-type activations, and by the approximation advantages of deep neural networks with sigmoid-type activations (called deep sigmoid nets) over their rectified linear unit (ReLU) counterparts (called deep ReLU nets).
no code implementations • 28 May 2018 • Ding-Xuan Zhou
Deep learning has been widely applied and brought breakthroughs in speech recognition, computer vision, and many other domains.
no code implementations • 9 Mar 2018 • Charles K. Chui, Shao-Bo Lin, Ding-Xuan Zhou
The subject of deep learning has recently attracted users of machine learning from various disciplines, including medical diagnosis and bioinformatics, financial market analysis and online advertisement, speech and handwriting recognition, computer vision and natural language processing, time series forecasting, and search engines.
no code implementations • 18 Feb 2018 • Yunwen Lei, Ding-Xuan Zhou
The condition is $\lim_{t\to\infty}\eta_t=0, \sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances.
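For instance, any polynomially decaying schedule $\eta_t = \eta_1 t^{-\theta}$ with $\theta \in (0, 1]$ satisfies both requirements. The toy SGD run below (least-squares objective and synthetic data chosen only for illustration, not the paper's setting) uses such a schedule.

```python
import numpy as np

rng = np.random.default_rng(3)
n, d = 500, 5
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = X @ w_true + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
for t in range(1, 10001):
    eta_t = 0.1 / np.sqrt(t)              # eta_t -> 0 and sum_t eta_t = infinity
    i = rng.integers(n)                   # draw one sample at random
    grad = (X[i] @ w - y[i]) * X[i]       # stochastic gradient of the squared loss
    w -= eta_t * grad

print(np.linalg.norm(w - w_true))         # distance to the target; shrinks as t grows
```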
no code implementations • 22 Sep 2017 • Andreas Christmann, Dao-Hong Xiang, Ding-Xuan Zhou
However, the kernel actually used often depends on one or a few hyperparameters, or it may even be data dependent in a much more complicated manner.
no code implementations • 29 Jun 2017 • Yunwen Lei, Urun Dogan, Ding-Xuan Zhou, Marius Kloft
In this paper, we study data-dependent generalization error bounds exhibiting a mild dependency on the number of classes, making them suitable for multi-class learning with a large number of label classes.
no code implementations • 11 Aug 2016 • Shao-Bo Lin, Xin Guo, Ding-Xuan Zhou
We study distributed learning with the least squares regularization scheme in a reproducing kernel Hilbert space (RKHS).
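The basic divide-and-conquer scheme behind this line of work can be sketched as follows; the Gaussian kernel, the regularization parameter, and the synthetic data are illustrative assumptions rather than the choices analyzed in the paper. Each local server solves a regularized least squares problem on its own subset, and the global estimator averages the local ones.

```python
import numpy as np

def gaussian_kernel(A, B, sigma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def local_krr(X, y, lam=1e-2):
    """Kernel ridge regression on one local data subset."""
    K = gaussian_kernel(X, X)
    alpha = np.linalg.solve(K + lam * len(y) * np.eye(len(y)), y)
    return X, alpha

def distributed_predict(models, x_test):
    """Average the local estimators (divide-and-conquer)."""
    preds = [gaussian_kernel(x_test, Xj) @ aj for Xj, aj in models]
    return np.mean(preds, axis=0)

rng = np.random.default_rng(4)
X = rng.uniform(-1, 1, (600, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(600)
models = [local_krr(Xj, yj) for Xj, yj in zip(np.split(X, 6), np.split(y, 6))]
x_test = np.linspace(-1, 1, 5)[:, None]
print(distributed_predict(models, x_test))   # approximates sin(pi * x_test)
```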
no code implementations • 12 Oct 2015 • Andreas Christmann, Ding-Xuan Zhou
Regularized empirical risk minimization including support vector machines plays an important role in machine learning theory.
no code implementations • 31 Mar 2015 • Junhong Lin, Lorenzo Rosasco, Ding-Xuan Zhou
We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method.
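A simplified finite-dimensional sketch of the idea is given below, using a linear model with the hinge loss; the data, step sizes, and stopping time are illustrative assumptions. The point is that regularization is induced by early stopping of plain subgradient descent rather than by adding a penalty term.

```python
import numpy as np

def hinge_subgradient(w, X, y):
    """A subgradient of the average hinge loss at w."""
    margins = y * (X @ w)
    active = margins < 1                       # samples violating the margin
    if not active.any():
        return np.zeros_like(w)
    return -(X[active] * y[active, None]).mean(axis=0)

rng = np.random.default_rng(5)
n, d = 400, 10
X = rng.standard_normal((n, d))
w_true = rng.standard_normal(d)
y = np.sign(X @ w_true + 0.3 * rng.standard_normal(n))

w, T = np.zeros(d), 150                        # the stopping time T plays the role of the regularization parameter
for t in range(1, T + 1):
    w -= (1.0 / np.sqrt(t)) * hinge_subgradient(w, X, y)

print(np.mean(np.sign(X @ w) == y))            # training accuracy of the early-stopped iterate
```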
no code implementations • 10 Mar 2015 • Ming Yuan, Ding-Xuan Zhou
We establish minimax optimal rates of convergence for estimation in a high dimensional additive model assuming that it is approximately sparse.
no code implementations • 2 Mar 2015 • Yiming Ying, Ding-Xuan Zhou
First, we derive explicit convergence rates of the unregularized online learning algorithms for classification associated with a general $\gamma$-activating loss (see Definition 1 in the paper).
no code implementations • 25 Feb 2015 • Yiming Ying, Ding-Xuan Zhou
In this paper, we study an online algorithm for pairwise learning with a least-square loss function in an unconstrained setting of a reproducing kernel Hilbert space (RKHS), which we refer to as the Online Pairwise lEaRning Algorithm (OPERA).
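A much simplified sketch of an online pairwise least-squares update in an RKHS is given below; the Gaussian kernel on concatenated pairs, the step sizes, and the choice of pairing each new point with all previously seen ones are assumptions for illustration, not the exact OPERA iteration or its analysis.

```python
import numpy as np

def pair_kernel(p, q, sigma=1.0):
    """Gaussian kernel on concatenated pairs p = (x, x')."""
    return np.exp(-np.sum((p - q) ** 2) / (2 * sigma ** 2))

class OnlinePairwiseLS:
    """Simplified online pairwise least squares in an RKHS."""
    def __init__(self):
        self.points, self.coefs, self.seen = [], [], []

    def predict(self, pair):
        return sum(c * pair_kernel(p, pair) for p, c in zip(self.points, self.coefs))

    def update(self, x, y, eta):
        new = []
        for (xj, yj) in self.seen:
            pair = np.concatenate([x, xj])
            resid = self.predict(pair) - (y - yj)   # pairwise least-square residual w.r.t. the current f_t
            new.append((pair, -eta * resid / len(self.seen)))
        for pair, coef in new:                      # f_{t+1} = f_t - eta_t * averaged gradient
            self.points.append(pair)
            self.coefs.append(coef)
        self.seen.append((x, y))

rng = np.random.default_rng(6)
model = OnlinePairwiseLS()
for t in range(1, 31):
    x = rng.uniform(-1, 1, 2)
    y = x.sum()                                 # toy target: f*(x, x') = sum(x) - sum(x')
    model.update(x, y, eta=0.5 / np.sqrt(t))
print(model.predict(np.concatenate([[0.5, 0.5], [-0.5, -0.5]])))  # predicted y - y' (target value 2)
```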
no code implementations • 17 Dec 2014 • Jun Fan, Ting Hu, Qiang Wu, Ding-Xuan Zhou
The error entropy consistency, which requires the error entropy of the learned function to approximate the minimum error entropy, is shown to always hold if the bandwidth parameter tends to 0 at an appropriate rate.
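The quantity being controlled can be illustrated with the usual kernel-based estimator of Renyi's quadratic entropy of the errors; the Gaussian kernel, the bandwidths, and the synthetic residuals below are only for illustration and are not the paper's assumptions.

```python
import numpy as np

def empirical_error_entropy(errors, h):
    """Kernel estimator of Renyi's quadratic entropy of the errors:
    H_h(e) = -log( (1/m^2) * sum_{i,j} G_h(e_i - e_j) ), with a Gaussian kernel G_h."""
    diffs = errors[:, None] - errors[None, :]
    G = np.exp(-diffs ** 2 / (2 * h ** 2)) / (np.sqrt(2 * np.pi) * h)
    return -np.log(G.mean())

rng = np.random.default_rng(7)
errors = 0.3 * rng.standard_normal(1000)          # residuals of some learned function
for h in (1.0, 0.3, 0.1, 0.03):                   # shrinking bandwidth parameter
    print(h, empirical_error_entropy(errors, h))  # approaches the quadratic entropy of the error density
```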
no code implementations • 14 May 2014 • Andreas Christmann, Ding-Xuan Zhou
Additive models play an important role in semiparametric statistics.