Search Results for author: Ding-Xuan Zhou

Found 30 papers, 2 papers with code

Attention Enables Zero Approximation Error

no code implementations · 24 Feb 2022 · Zhiying Fang, Yidong Ouyang, Ding-Xuan Zhou, Guang Cheng

In this work, we show that with suitable adaptations, the single-head self-attention transformer with a fixed number of transformer encoder blocks and free parameters is able to generate any desired polynomial of the input with no error.

Image Classification
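A minimal NumPy sketch of the single-head self-attention block at the heart of this construction (the residual connection, scaling, and random weights below are illustrative assumptions, not the specific parameter choices used in the paper):

```python
import numpy as np

def single_head_attention(X, Wq, Wk, Wv, Wo):
    """One single-head self-attention block with a residual connection.

    X: (n_tokens, d) input sequence; Wq, Wk, Wv, Wo: (d, d) weight matrices.
    """
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(X.shape[1])           # scaled dot-product scores
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)    # row-wise softmax
    return X + (weights @ V) @ Wo                    # residual connection

# Stack a fixed number of such encoder blocks on a toy input.
rng = np.random.default_rng(0)
n_tokens, d = 3, 4
X = rng.standard_normal((n_tokens, d))
for _ in range(2):                                   # fixed number of blocks
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    X = single_head_attention(X, Wq, Wk, Wv, Wo)
print(X.shape)                                       # (3, 4)
```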

Radial Basis Function Approximation with Distributively Stored Data on Spheres

no code implementations · 5 Dec 2021 · Han Feng, Shao-Bo Lin, Ding-Xuan Zhou

This paper proposes a distributed weighted regularized least squares algorithm (DWRLS) based on spherical radial basis functions and spherical quadrature rules to tackle spherical data that are stored across numerous local servers and cannot be shared with each other.

Generalization Performance of Empirical Risk Minimization on Over-parameterized Deep ReLU Nets

no code implementations · 28 Nov 2021 · Shao-Bo Lin, Yao Wang, Ding-Xuan Zhou

In this paper, we study the generalization performance of global minima for implementing empirical risk minimization (ERM) on over-parameterized deep ReLU nets.
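For concreteness, the global minima studied here are those of the standard empirical risk, e.g. with the least-squares loss over a class $\mathcal{H}$ of over-parameterized deep ReLU nets (a generic statement of ERM, not a result from the paper):
$$\hat f \in \arg\min_{f \in \mathcal{H}} \frac{1}{n}\sum_{i=1}^{n}\big(f(x_i) - y_i\big)^2 .$$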

Theory of Deep Convolutional Neural Networks III: Approximating Radial Functions

no code implementations · 2 Jul 2021 · Tong Mao, Zhongjie Shi, Ding-Xuan Zhou

We consider a family of deep neural networks consisting of two groups of convolutional layers, a downsampling operator, and a fully connected layer.
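A hedged PyTorch sketch of this family (two groups of one-dimensional convolutional layers, a downsampling operator between them, and a fully connected output layer); the kernel size, depth per group, pooling choice, and activation are illustrative assumptions:

```python
import torch
import torch.nn as nn

class TwoGroupConvNet(nn.Module):
    """Two groups of convolutional layers, a downsampling operator in between,
    and a fully connected layer on top (structure only; hyperparameters are guesses)."""

    def __init__(self, d_in: int, depth_per_group: int = 3, kernel_size: int = 3):
        super().__init__()
        def group():
            layers = []
            for _ in range(depth_per_group):
                # zero-padded 1-D convolution; the sequence length grows by kernel_size - 1
                layers += [nn.Conv1d(1, 1, kernel_size, padding=kernel_size - 1), nn.ReLU()]
            return nn.Sequential(*layers)
        self.group1 = group()
        self.downsample = nn.MaxPool1d(kernel_size=2)   # downsampling operator
        self.group2 = group()
        with torch.no_grad():                            # infer the flattened width
            width = self.group2(self.downsample(self.group1(torch.zeros(1, 1, d_in)))).numel()
        self.fc = nn.Linear(width, 1)                    # fully connected layer

    def forward(self, x):                                # x: (batch, d_in)
        h = self.group1(x.unsqueeze(1))
        h = self.downsample(h)
        h = self.group2(h)
        return self.fc(h.flatten(1))

net = TwoGroupConvNet(d_in=16)
print(net(torch.randn(8, 16)).shape)                     # torch.Size([8, 1])
```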

Universal Consistency of Deep Convolutional Neural Networks

no code implementations · 23 Jun 2021 · Shao-Bo Lin, Kaidong Wang, Yao Wang, Ding-Xuan Zhou

Compared with the avid research activity on deep convolutional neural networks (DCNNs) in practice, the study of their theoretical behavior lags far behind.

Robust Kernel-based Distribution Regression

no code implementations · 21 Apr 2021 · Zhan Yu, Daniel W. C. Ho, Ding-Xuan Zhou

Regularization schemes for regression have been widely studied in learning theory and inverse problems.

Learning Theory

Moreau Envelope Augmented Lagrangian Method for Nonconvex Optimization with Linear Constraints

no code implementations · 21 Jan 2021 · Jinshan Zeng, Wotao Yin, Ding-Xuan Zhou

We modify the augmented Lagrangian method (ALM) to use a Moreau envelope of the augmented Lagrangian and establish its convergence under conditions that are weaker than those in the literature.

Optimization and Control
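For reference, these are the standard objects the paper builds on (textbook definitions, not the paper's specific scheme). For $\min_x f(x)$ subject to $Ax = b$, the augmented Lagrangian with penalty $\rho > 0$ is
$$\mathcal{L}_\rho(x, \lambda) = f(x) + \langle \lambda, Ax - b \rangle + \tfrac{\rho}{2}\|Ax - b\|^2,$$
classical ALM alternates $x^{k+1} \in \arg\min_x \mathcal{L}_\rho(x, \lambda^k)$ with $\lambda^{k+1} = \lambda^k + \rho(Ax^{k+1} - b)$, and the Moreau envelope of a function $g$ with parameter $\mu > 0$ is the smoothed surrogate
$$e_\mu g(x) = \min_z \Big\{ g(z) + \tfrac{1}{2\mu}\|z - x\|^2 \Big\}.$$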

Theory of Deep Convolutional Neural Networks II: Spherical Analysis

no code implementations · 28 Jul 2020 · Zhiying Fang, Han Feng, Shuo Huang, Ding-Xuan Zhou

Deep learning based on deep neural networks of various structures and architectures has been powerful in many practical applications, but it still lacks sufficient theoretical justification.

Depth Selection for Deep ReLU Nets in Feature Extraction and Generalization

no code implementations · 1 Apr 2020 · Zhi Han, Siquan Yu, Shao-Bo Lin, Ding-Xuan Zhou

One of the most important challenges in deep learning is to figure out the relation between a feature and the depth of deep neural networks (deep nets for short), so as to reflect the necessity of depth.

Feature Engineering, Representation Learning

Distributed Kernel Ridge Regression with Communications

no code implementations · 27 Mar 2020 · Shao-Bo Lin, Di Wang, Ding-Xuan Zhou

This paper focuses on generalization performance analysis for distributed algorithms in the framework of learning theory.

Learning Theory

Realization of spatial sparseness by deep ReLU nets with massive data

no code implementations · 16 Dec 2019 · Charles K. Chui, Shao-Bo Lin, Bo Zhang, Ding-Xuan Zhou

The great success of deep learning poses urgent challenges for understanding its working mechanism and rationality.

Learning Theory

Towards Understanding the Spectral Bias of Deep Learning

no code implementations · 3 Dec 2019 · Yuan Cao, Zhiying Fang, Yue Wu, Ding-Xuan Zhou, Quanquan Gu

An intriguing phenomenon observed when training neural networks is the spectral bias: neural networks are biased towards learning less complex functions.

Optimal Stochastic and Online Learning with Individual Iterates

no code implementations · NeurIPS 2019 · Yunwen Lei, Peng Yang, Ke Tang, Ding-Xuan Zhou

In this paper, we propose a theoretically sound strategy to select an individual iterate of the vanilla SCMD, which is able to achieve optimal rates for both convex and strongly convex problems in a non-smooth learning setting.

online learning, Sparse Learning

Fast Polynomial Kernel Classification for Massive Data

1 code implementation · 24 Nov 2019 · Jinshan Zeng, Minrun Wu, Shao-Bo Lin, Ding-Xuan Zhou

In the era of big data, it is highly desired to develop efficient machine learning algorithms to tackle massive data challenges such as storage bottleneck, algorithmic scalability, and interpretability.

Classification, General Classification
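A hedged scikit-learn sketch of the general idea behind fast polynomial kernel classification on large samples: approximate the polynomial kernel with a low-rank (Nystroem) feature map and train a linear classifier on top, so the cost scales with the number of landmark points rather than with an $n \times n$ kernel matrix (the specific approximation and the hinge-loss classifier are assumptions, not necessarily the paper's algorithm):

```python
from sklearn.datasets import make_classification
from sklearn.kernel_approximation import Nystroem
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy stand-in for "massive" data.
X, y = make_classification(n_samples=20000, n_features=50, random_state=0)

clf = make_pipeline(
    # low-rank approximation of the degree-3 polynomial kernel
    Nystroem(kernel="poly", degree=3, coef0=1.0, n_components=200, random_state=0),
    LinearSVC(C=1.0),                 # linear classifier in the approximate feature space
)
clf.fit(X, y)
print(clf.score(X, y))                # training accuracy on the toy data
```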

Distributed filtered hyperinterpolation for noisy data on the sphere

no code implementations · 6 Oct 2019 · Shao-Bo Lin, Yu Guang Wang, Ding-Xuan Zhou

This paper develops distributed filtered hyperinterpolation for noisy data on the sphere, which assigns the data-fitting task to multiple servers to find a good approximation of the mapping from input data to output data.

Geophysics, Model Selection

Deep Neural Networks for Rotation-Invariance Approximation and Learning

no code implementations · 3 Apr 2019 · Charles K. Chui, Shao-Bo Lin, Ding-Xuan Zhou

Based on the tree architecture, the objective of this paper is to design deep neural networks with two or more hidden layers (called deep nets) for the realization of radial functions, so as to enable rotational invariance for near-optimal function approximation in an arbitrarily high-dimensional Euclidean space.

On ADMM in Deep Learning: Convergence and Saturation-Avoidance

1 code implementation · 6 Feb 2019 · Jinshan Zeng, Shao-Bo Lin, Yuan YAO, Ding-Xuan Zhou

In this paper, we develop an alternating direction method of multipliers (ADMM) for training deep neural networks with sigmoid-type activation functions (called the \textit{sigmoid-ADMM pair}). The approach is mainly motivated by the gradient-free nature of ADMM, which avoids the saturation of sigmoid-type activations, and by the advantages of deep neural networks with sigmoid-type activations (called deep sigmoid nets) over their rectified linear unit (ReLU) counterparts (called deep ReLU nets) in terms of approximation.

Universality of Deep Convolutional Neural Networks

no code implementations · 28 May 2018 · Ding-Xuan Zhou

Deep learning has been widely applied and brought breakthroughs in speech recognition, computer vision, and many other domains.

Learning Theory, Speech Recognition

Construction of neural networks for realization of localized deep learning

no code implementations · 9 Mar 2018 · Charles K. Chui, Shao-Bo Lin, Ding-Xuan Zhou

The subject of deep learning has recently attracted users of machine learning from various disciplines, including medical diagnosis and bioinformatics, financial market analysis and online advertisement, speech and handwriting recognition, computer vision and natural language processing, time series forecasting, and search engines.

Dimensionality Reduction, Handwriting Recognition +4

Convergence of Online Mirror Descent

no code implementations · 18 Feb 2018 · Yunwen Lei, Ding-Xuan Zhou

The condition on the step sizes is $\lim_{t\to\infty}\eta_t=0$ and $\sum_{t=1}^{\infty}\eta_t=\infty$ in the case of positive variances.

online learning
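For example, the step sizes $\eta_t = 1/\sqrt{t}$ satisfy both $\lim_{t\to\infty}\eta_t = 0$ and $\sum_{t=1}^{\infty}\eta_t = \infty$. A minimal sketch of online mirror descent with the Euclidean mirror map (i.e., online gradient descent) under such step sizes, on a toy least-squares stream; the data model and loss are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
d = 5
w_true = rng.standard_normal(d)
w = np.zeros(d)

for t in range(1, 10001):
    x = rng.standard_normal(d)
    y = x @ w_true + 0.1 * rng.standard_normal()
    eta_t = 1.0 / np.sqrt(t)          # eta_t -> 0 while sum_t eta_t diverges
    grad = (w @ x - y) * x            # gradient of the squared loss at (x, y)
    w = w - eta_t * grad              # mirror descent step (Euclidean mirror map)

print(np.linalg.norm(w - w_true))     # should be small after many rounds
```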

Total stability of kernel methods

no code implementations · 22 Sep 2017 · Andreas Christmann, Dao-Hong Xiang, Ding-Xuan Zhou

However, the kernel actually used often depends on one or a few hyperparameters, or it is even data-dependent in a much more complicated manner.

Data-dependent Generalization Bounds for Multi-class Classification

no code implementations · 29 Jun 2017 · Yunwen Lei, Urun Dogan, Ding-Xuan Zhou, Marius Kloft

In this paper, we study data-dependent generalization error bounds exhibiting a mild dependency on the number of classes, making them suitable for multi-class learning with a large number of label classes.

Classification, General Classification +2

Distributed learning with regularized least squares

no code implementations · 11 Aug 2016 · Shao-Bo Lin, Xin Guo, Ding-Xuan Zhou

We study distributed learning with the least squares regularization scheme in a reproducing kernel Hilbert space (RKHS).
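A hedged NumPy sketch of the divide-and-conquer flavor of this setting: each local machine runs regularized least squares (kernel ridge regression) on its own subsample, and the global estimator averages the local ones; the Gaussian kernel, the equal split, and plain averaging are illustrative assumptions:

```python
import numpy as np

def gauss_kernel(A, B, sigma=0.5):
    d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-d2 / (2 * sigma ** 2))

def local_estimator(X, y, lam=1e-2):
    """Regularized least squares in an RKHS on one machine: alpha = (K + n*lam*I)^{-1} y."""
    K = gauss_kernel(X, X)
    alpha = np.linalg.solve(K + len(X) * lam * np.eye(len(X)), y)
    return lambda Xnew, X=X, alpha=alpha: gauss_kernel(Xnew, X) @ alpha

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(600, 1))
y = np.sin(np.pi * X[:, 0]) + 0.1 * rng.standard_normal(600)

# Split the sample across m local machines, fit locally, then average the local estimators.
m = 6
local_models = [local_estimator(Xj, yj)
                for Xj, yj in zip(np.array_split(X, m), np.array_split(y, m))]
X_test = np.linspace(-1, 1, 5).reshape(-1, 1)
f_bar = np.mean([f(X_test) for f in local_models], axis=0)
print(f_bar)                          # averaged (global) estimator at a few test points
```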

On the Robustness of Regularized Pairwise Learning Methods Based on Kernels

no code implementations · 12 Oct 2015 · Andreas Christmann, Ding-Xuan Zhou

Regularized empirical risk minimization including support vector machines plays an important role in machine learning theory.

Learning Theory

Iterative Regularization for Learning with Convex Loss Functions

no code implementations · 31 Mar 2015 · Junhong Lin, Lorenzo Rosasco, Ding-Xuan Zhou

We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method.

Minimax Optimal Rates of Estimation in High Dimensional Additive Models: Universal Phase Transition

no code implementations · 10 Mar 2015 · Ming Yuan, Ding-Xuan Zhou

We establish minimax optimal rates of convergence for estimation in a high dimensional additive model assuming that it is approximately sparse.

Additive models

Unregularized Online Learning Algorithms with General Loss Functions

no code implementations · 2 Mar 2015 · Yiming Ying, Ding-Xuan Zhou

Firstly, we derive explicit convergence rates of the unregularized online learning algorithms for classification associated with a general gamma-activating loss (see Definition 1 in the paper).

online learning

Online Pairwise Learning Algorithms with Kernels

no code implementations · 25 Feb 2015 · Yiming Ying, Ding-Xuan Zhou

In this paper, we study an online algorithm for pairwise learning with a least-square loss function in an unconstrained setting of a reproducing kernel Hilbert space (RKHS), which we refer to as the Online Pairwise lEaRning Algorithm (OPERA).

Metric Learning
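A rough NumPy sketch of online pairwise learning with a least-squares loss in an RKHS: at round $t$, the new example is paired with all previous ones and the coefficients of the kernel expansion are updated by a gradient step (the Gaussian kernel, the step sizes, and the pairing rule are illustrative assumptions and not the exact OPERA update):

```python
import numpy as np

def kvec(x, Xs, sigma=0.5):
    """Gaussian kernel values k(x, x_i) for all stored points x_i."""
    return np.exp(-((np.asarray(Xs) - x) ** 2).sum(axis=1) / (2 * sigma ** 2))

rng = np.random.default_rng(0)
T, d = 300, 2
Xs, ys, coef = [], [], np.zeros(0)     # f(x) = sum_i coef[i] * k(x, x_i)

for t in range(T):
    x_t = rng.standard_normal(d)
    y_t = x_t[0] - x_t[1]              # toy target; the loss compares pairwise differences
    if Xs:
        K = np.asarray([kvec(x, Xs) for x in Xs + [x_t]])
        fvals = K @ coef               # f at the stored points and, last, at x_t
        r = (fvals[-1] - fvals[:-1]) - (y_t - np.asarray(ys))     # pairwise residuals
        eta = 1.0 / np.sqrt(t + 1)
        coef = coef + eta * (2.0 / len(Xs)) * r                   # update weights on k(., x_i)
        coef = np.append(coef, -eta * (2.0 / len(Xs)) * r.sum())  # new weight on k(., x_t)
    else:
        coef = np.append(coef, 0.0)
    Xs.append(x_t)
    ys.append(y_t)

a, b = np.array([0.5, -0.5]), np.array([-0.5, 0.5])
print(kvec(a, Xs) @ coef - kvec(b, Xs) @ coef)   # learned pairwise difference f(a) - f(b)
```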

Consistency Analysis of an Empirical Minimum Error Entropy Algorithm

no code implementations · 17 Dec 2014 · Jun Fan, Ting Hu, Qiang Wu, Ding-Xuan Zhou

The error entropy consistency, which requires the error entropy of the learned function to approximate the minimum error entropy, is shown to always hold if the bandwidth parameter tends to 0 at an appropriate rate.
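As a point of reference, the empirical minimum error entropy (MEE) principle replaces the mean squared error by a kernel-density (Parzen) estimate of the entropy of the errors, with a bandwidth parameter $h$. A hedged NumPy sketch using Renyi's quadratic entropy, i.e. maximizing the information potential $V = \frac{1}{n^2}\sum_{i,j} G_h(e_i - e_j)$; the linear model, fixed bandwidth, and plain gradient ascent are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, h = 200, 3, 2.0
X = rng.standard_normal((n, d))
y = X @ np.array([1.0, -2.0, 0.5]) + 0.1 * rng.standard_normal(n)

w = np.zeros(d)
for _ in range(500):                              # gradient ascent on the information potential
    e = y - X @ w                                 # current errors
    diff = e[:, None] - e[None, :]                # e_i - e_j
    G = np.exp(-diff ** 2 / (2 * h ** 2))         # Gaussian window of bandwidth h
    # dV/dw via the chain rule through e = y - Xw
    grad = (G * diff / h ** 2)[:, :, None] * (X[:, None, :] - X[None, :, :])
    w += 0.5 * grad.mean(axis=(0, 1))

print(w)   # should be close to the true slope (MEE is insensitive to a constant shift in the errors)
```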
