Search Results for author: Yaoyu Zhang

Found 26 papers, 6 papers with code

Structure and Gradient Dynamics Near Global Minima of Two-layer Neural Networks

no code implementations • 1 Sep 2023 • Leyang Zhang, Yaoyu Zhang, Tao Luo

Under mild assumptions, we investigate the structure of loss landscape of two-layer neural networks near global minima, determine the set of parameters which give perfect generalization, and fully characterize the gradient flows around it.

Paper
Add Code

Optimistic Estimate Uncovers the Potential of Nonlinear Models

no code implementations • 18 Jul 2023 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.

Paper
Add Code

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

no code implementations • 21 Nov 2022 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

By these results, model rank of a target function predicts a minimal training data size for its successful recovery.

Paper
Add Code

Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

no code implementations • 26 May 2022 • Zhiwei Bai, Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang

Regarding the easy training of deep networks, we show that local minimum of an NN can be lifted to strict saddle points of a deeper NN.

Paper
Add Code

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width

no code implementations • 24 May 2022 • Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Through experiments under three-layer condition, our phase diagram suggests a complicated dynamical regimes consisting of three possible regimes, together with their mixture, for deep NNs and provides a guidance for studying deep NNs in different initialization regimes, which reveals the possibility of completely different dynamics emerging within a deep NN for its different layers.

Paper
Add Code

Limitation of Characterizing Implicit Regularization by Data-independent Functions

no code implementations • 28 Jan 2022 • Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang

In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory.

Learning Theory

Paper
Add Code

Overview frequency principle/spectral bias in deep learning

no code implementations • 19 Jan 2022 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo

This low-frequency implicit bias reveals the strength of neural network in learning low-frequency functions as well as its deficiency in learning high-frequency functions.

Paper
Add Code

A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics

no code implementations • 9 Jan 2022 • Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu

The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be.

Paper
Add Code

A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics

no code implementations • 6 Jan 2022 • Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang

The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a species, represents a reduced mechanism.

Paper
Add Code

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

We prove a general Embedding Principle of loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape of NNs, i. e., loss landscape of an NN contains all critical points of all the narrower NNs.

Paper
Add Code

Data-informed Deep Optimization

no code implementations • 17 Jul 2021 • Lulu Zhang, Zhi-Qin John Xu, Yaoyu Zhang

Complex design problems are common in the scientific and industrial fields.

Paper
Add Code

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a a machine learning approach via model-operator-data network (MOD-Net) for solving PDEs.

Paper
Add Code

Embedding Principle of Loss Landscape of Deep Neural Networks

no code implementations • NeurIPS 2021 • Yaoyu Zhang, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

Understanding the structure of loss landscape of deep neural networks (DNNs)is obviously important.

Protein Folding

Paper
Add Code

An Upper Limit of Decaying Rate with Respect to Frequency in Deep Neural Network

no code implementations • 25 May 2021 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

frequency in DNN training.

Paper
Add Code

Towards Understanding the Condensation of Neural Networks at Initial Training

no code implementations • 25 May 2021 • Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Our theoretical analysis confirms experiments for two cases, one is for the activation function of multiplicity one with arbitrary dimension input, which contains many common activation functions, and the other is for the layer with one-dimensional input and arbitrary multiplicity.

Paper
Add Code

Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

no code implementations • 30 Jan 2021 • Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu

Why heavily parameterized neural networks (NNs) do not overfit the data is an important long standing open question.

Open-Ended Question Answering

Paper
Add Code

Fourier-domain Variational Formulation and Its Well-posedness for Supervised Learning

no code implementations • 6 Dec 2020 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.

Paper
Add Code

A deep learning-based ODE solver for chemical kinetics

no code implementations • 24 Nov 2020 • Tianhan Zhang, Yaoyu Zhang, Weinan E, Yiguang Ju

Besides, the ignition delay time differences are within 1%.

Paper
Add Code

On the exact computation of linear frequency principle dynamics and its generalization

1 code implementation • 15 Oct 2020 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.

Paper
Code

Phase diagram for two-layer ReLU neural networks at infinite-width limit

1 code implementation • 15 Jul 2020 • Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang

In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.

Paper
Code

A priori generalization error for two-layer ReLU neural network through minimum norm solution

no code implementations • 6 Dec 2019 • Zhi-Qin John Xu, Jiwei Zhang, Yaoyu Zhang, Chengchao Zhao

We first estimate \emph{a priori} generalization error of finite-width two-layer ReLU NN with constraint of minimal norm solution, which is proved by \cite{zhang2019type} to be an equivalent solution of a linearized (w. r. t.

Paper
Add Code

Theory of the Frequency Principle for General Deep Neural Networks

1 code implementation • 21 Jun 2019 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.

Paper
Code

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

1 code implementation • 24 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

It remains a puzzle that why deep neural networks (DNNs), with more parameters than samples, often generalize well.

Paper
Code

A type of generalization error induced by initialization in deep neural networks

no code implementations • 19 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

Paper
Add Code

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

3 code implementations • 19 Jan 2019 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.

Paper
Code

Training behavior of deep neural network in frequency domain

1 code implementation • 3 Jul 2018 • Zhi-Qin John Xu, Yaoyu Zhang, Yanyang Xiao

Why deep neural networks (DNNs) capable of overfitting often generalize well in practice is a mystery [#zhang2016understanding].

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.