no code implementations • 3 Feb 2024 • Hanxu Zhou, Yuan Zhang, Guangjie Leng, Ruofan Wang, Zhi-Qin John Xu
Therefore, in this article, we revisit and redefine the time series anomaly detection problem through OCC, which we call the 'time series anomaly state detection problem'.
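As a rough illustration of the OCC view, the sketch below trains a one-class model on windows of a normal series and scores windows of a test series; the window length, model, and injected anomaly are assumptions for illustration, not the paper's setup.

```python
# Minimal sketch: one-class classification (OCC) on sliding windows of a
# univariate time series. Window length and the OneClassSVM hyperparameters
# are illustrative assumptions, not the setup used in the paper.
import numpy as np
from sklearn.svm import OneClassSVM

def sliding_windows(series, width):
    """Stack overlapping windows of `width` samples as feature vectors."""
    return np.lib.stride_tricks.sliding_window_view(series, width)

rng = np.random.default_rng(0)
normal_series = np.sin(np.linspace(0, 20 * np.pi, 2000)) + 0.05 * rng.normal(size=2000)

test_series = np.sin(np.linspace(0, 2 * np.pi, 200)) + 0.05 * rng.normal(size=200)
test_series[90:110] += 2.0  # inject an anomalous state

width = 32
detector = OneClassSVM(kernel="rbf", gamma="scale", nu=0.01)
detector.fit(sliding_windows(normal_series, width))  # train on normal data only

scores = detector.decision_function(sliding_windows(test_series, width))
print("most anomalous window starts near index", int(np.argmin(scores)))
```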
no code implementations • 16 Jan 2024 • Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu
However, language model research faces significant challenges, especially for academic research groups with constrained resources.
no code implementations • 8 Nov 2023 • Xiong-bin Yan, Keke Wu, Zhi-Qin John Xu, Zheng Ma
Full-waveform inversion (FWI) is a powerful geophysical imaging technique that infers high-resolution subsurface physical parameters by solving a non-convex optimization problem.
no code implementations • 18 Jul 2023 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu
We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.
no code implementations • 25 May 2023 • Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu
In order to investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the derived stochastic modified equation for dropout.
no code implementations • 20 May 2023 • Zhongwang Zhang, Zhi-Qin John Xu
In this work, we study the mechanism underlying loss spikes observed during neural network training.
no code implementations • 17 May 2023 • Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu
Previous research has shown that fully-connected networks with small initialization and gradient-based training methods exhibit a phenomenon known as condensation during training.
no code implementations • 3 Apr 2023 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma
To address this issue, this paper proposes an extension to PINNs called Laplace-based fractional physics-informed neural networks (Laplace-fPINNs), which can effectively solve the forward and inverse problems of fractional diffusion equations.
no code implementations • 12 Mar 2023 • Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu
The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research.
no code implementations • 22 Nov 2022 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma
A large number of numerical experiments demonstrate that the operator learning method proposed in this work can efficiently solve the forward problems and Bayesian inverse problems of the subdiffusion equation.
no code implementations • 21 Nov 2022 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu
By these results, the model rank of a target function predicts the minimal training data size needed for its successful recovery.
no code implementations • 13 Jul 2022 • Zhongwang Zhang, Zhi-Qin John Xu
Secondly, we experimentally find that training with dropout leads to a neural network with a flatter minimum than standard gradient descent training, and that this implicit regularization is the key to finding flat solutions.
no code implementations • 26 May 2022 • Zhiwei Bai, Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang
Regarding the easy training of deep networks, we show that a local minimum of an NN can be lifted to strict saddle points of a deeper NN.
no code implementations • 25 May 2022 • Shuyu Yin, Tao Luo, Peilin Liu, Zhi-Qin John Xu
In this work, we perform extensive experiments to show that TD outperforms RG: when training leads to a small Bellman residual error, the solution found by TD yields a better policy and is more robust to perturbations of the neural network parameters.
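The contrast between the two updates can be made concrete on a linear value function; the sketch below is an illustrative toy (features, step size, and transition are made up), not the paper's experimental setup. TD follows the semi-gradient that treats the bootstrap target as fixed, while RG differentiates the full squared Bellman residual.

```python
# Sketch: one-step TD vs. residual-gradient (RG) updates for a linear value
# function V(s) = w @ phi(s), on a single toy transition.
import numpy as np

def td_update(w, phi_s, r, phi_s_next, gamma, lr):
    # TD: gradient only through V(s); the bootstrap target is held fixed.
    delta = r + gamma * w @ phi_s_next - w @ phi_s
    return w + lr * delta * phi_s

def rg_update(w, phi_s, r, phi_s_next, gamma, lr):
    # RG: true gradient of the squared Bellman residual, so the next-state
    # value is also differentiated.
    delta = r + gamma * w @ phi_s_next - w @ phi_s
    return w - lr * delta * (gamma * phi_s_next - phi_s)

w_td = w_rg = np.zeros(2)
phi_s, r, phi_s_next = np.array([1.0, 0.0]), 1.0, np.array([0.0, 1.0])
for _ in range(100):
    w_td = td_update(w_td, phi_s, r, phi_s_next, gamma=0.9, lr=0.1)
    w_rg = rg_update(w_rg, phi_s, r, phi_s_next, gamma=0.9, lr=0.1)
print("TD weights:", w_td, "RG weights:", w_rg)
```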
no code implementations • 24 May 2022 • Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu
Through experiments on three-layer networks, our phase diagram suggests a complicated dynamical picture for deep NNs, consisting of three possible regimes together with their mixtures, and provides guidance for studying deep NNs in different initialization regimes, revealing that completely different dynamics can emerge within a single deep NN across its layers.
no code implementations • 28 Jan 2022 • Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang
In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory.
no code implementations • 19 Jan 2022 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo
This low-frequency implicit bias reveals the strength of neural networks in learning low-frequency functions as well as their deficiency in learning high-frequency functions.
no code implementations • 9 Jan 2022 • Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu
The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be.
no code implementations • 6 Jan 2022 • Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang
The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a species, represents a reduced mechanism.
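A rough sketch of that encoding follows, with hypothetical species names and a placeholder error term; the actual objective in the paper trades simulation fidelity against mechanism size.

```python
# Sketch: a reduced chemical mechanism encoded as a Boolean vector, one entry
# per species. Species names and the error function are illustrative
# placeholders, not the paper's benchmark mechanisms or objective.
import numpy as np

species = ["H2", "O2", "H", "O", "OH", "HO2", "H2O2", "H2O", "N2"]

def mechanism_error(keep_mask):
    """Placeholder for the simulation error of the reduced mechanism
    (e.g. ignition-delay mismatch against the full mechanism)."""
    return float(np.sum(~keep_mask)) * 0.01  # dummy: dropping species costs error

def objective(keep_mask, size_penalty=0.05):
    # Trade off fidelity of the reduced mechanism against its size.
    return mechanism_error(keep_mask) + size_penalty * keep_mask.sum()

keep = np.ones(len(species), dtype=bool)
keep[species.index("H2O2")] = False  # candidate reduction: remove one species
print("kept species:", [s for s, k in zip(species, keep) if k])
print("objective:", objective(keep))
```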
1 code implementation • 10 Dec 2021 • Xi-An Li, Zhi-Qin John Xu, Lei Zhang
Numerical results show that the SD$^2$NN model is superior to existing models such as MscaleDNN.
no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs), which unravels a hierarchical structure of the loss landscape: the loss landscape of an NN contains all critical points of all narrower NNs.
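One concrete instance of such an embedding is splitting a hidden neuron into two copies that share its input weights and divide its output weight, so the wider network computes exactly the same function; the sketch below illustrates this on a two-layer tanh network with random weights (illustrative only).

```python
# Sketch: embedding a narrow two-layer network into a wider one by splitting
# one hidden neuron; the wider network represents the same function, which is
# the kind of mapping used to relate critical points across widths.
import numpy as np

rng = np.random.default_rng(0)
d, m = 3, 4  # input dimension, narrow hidden width
W, b, a = rng.normal(size=(m, d)), rng.normal(size=m), rng.normal(size=m)

def narrow(x):
    return a @ np.tanh(W @ x + b)

# Split neuron 0 into two copies with output weights alpha*a[0] and (1-alpha)*a[0].
alpha = 0.3
W_wide = np.vstack([W, W[0:1]])
b_wide = np.append(b, b[0])
a_wide = np.append(a, (1 - alpha) * a[0])
a_wide[0] = alpha * a[0]

def wide(x):
    return a_wide @ np.tanh(W_wide @ x + b_wide)

x = rng.normal(size=d)
print(np.isclose(narrow(x), wide(x)))  # True: same function, one more neuron
```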
no code implementations • 1 Nov 2021 • Zhongwang Zhang, Hanxu Zhou, Zhi-Qin John Xu
It is important to understand how dropout, a popular regularization method, helps neural network training find a solution that generalizes well.
no code implementations • 17 Jul 2021 • Lulu Zhang, Zhi-Qin John Xu, Yaoyu Zhang
Complex design problems are common in the scientific and industrial fields.
no code implementations • 13 Jul 2021 • Guangjie Leng, Yekun Zhu, Zhi-Qin John Xu
An in-domain GAN inversion approach was recently proposed to constrain the inverted code within the latent space by forcing the image reconstructed from the inverted code to lie within the real image space.
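A rough sketch of the optimization this describes is given below, with placeholder generator and encoder modules and an assumed weighting; it only illustrates reconstructing an image while regularizing the inverted code through the encoder, not the exact in-domain objective.

```python
# Sketch: GAN inversion with a domain regularizer. G (generator) and E (encoder)
# are random placeholder modules; only the latent code z is optimized.
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, img_dim = 16, 64
G = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(), nn.Linear(128, img_dim))
E = nn.Sequential(nn.Linear(img_dim, 128), nn.ReLU(), nn.Linear(128, latent_dim))
for p in list(G.parameters()) + list(E.parameters()):
    p.requires_grad_(False)  # G and E are fixed during inversion

target = torch.rand(1, img_dim)             # image to invert (placeholder)
z = E(target).clone().requires_grad_()      # initialize the code from the encoder
opt = torch.optim.Adam([z], lr=1e-2)

for step in range(500):
    recon = G(z)
    # Reconstruction term plus a term keeping E(G(z)) close to z, i.e. keeping
    # the reconstruction in a region the encoder maps back to the same code.
    loss = (recon - target).pow(2).mean() + 0.1 * (E(recon) - z).pow(2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```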
no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma
In this paper, we propose a machine learning approach via the model-operator-data network (MOD-Net) for solving PDEs.
no code implementations • NeurIPS 2021 • Yaoyu Zhang, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
Understanding the structure of the loss landscape of deep neural networks (DNNs) is obviously important.
no code implementations • 25 May 2021 • Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu
Our theoretical analysis confirms the experiments in two cases: one for activation functions of multiplicity one with input of arbitrary dimension, which covers many common activation functions, and the other for layers with one-dimensional input and arbitrary multiplicity.
no code implementations • 25 May 2021 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang
frequency in DNN training.
no code implementations • 30 Jan 2021 • Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu
Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question.
no code implementations • 4 Jan 2021 • Yuheng Ma, Zhi-Qin John Xu, Jiwei Zhang
The frequency perspective has recently made progress in understanding deep learning.
no code implementations • 6 Dec 2020 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang
A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.
1 code implementation • 15 Oct 2020 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang
Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.
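A minimal way to observe this numerically is to train a small network on a 1D target containing a low and a high frequency and track the Fourier error at each mode; the sketch below uses an illustrative architecture and target, not the papers' experimental settings.

```python
# Sketch: observing the Frequency Principle on a 1D regression task.
# Network size, target frequencies, and training length are illustrative.
import numpy as np
import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 256).unsqueeze(1)
y = torch.sin(np.pi * x) + 0.5 * torch.sin(10 * np.pi * x)  # low + high frequency

net = torch.nn.Sequential(
    torch.nn.Linear(1, 200), torch.nn.Tanh(), torch.nn.Linear(200, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

def spectrum(signal):
    return np.abs(np.fft.rfft(signal.detach().numpy().ravel()))

for step in range(2001):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    if step % 500 == 0:
        err = spectrum(net(x) - y)
        # frequency index 1 ~ the low mode, index 10 ~ the high mode
        print(f"step {step}: low-freq error {err[1]:.3f}, high-freq error {err[10]:.3f}")
```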
no code implementations • 30 Sep 2020 • Xi-An Li, Zhi-Qin John Xu, Lei Zhang
Algorithms based on deep neural networks (DNNs) have attracted increasing attention from the scientific computing community.
Computational Physics • Analysis of PDEs
2 code implementations • 29 Jul 2020 • Zhemin Li, Zhi-Qin John Xu, Tao Luo, Hongxia Wang
In this work, we propose a Regularized Deep Matrix Factorized (RDMF) model for image restoration, which utilizes the implicit bias of the low rank of deep neural networks and the explicit bias of total variation.
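A rough sketch of that combination, assuming a three-factor parameterization, a masked reconstruction loss, and an anisotropic total-variation penalty (factor sizes, weights, and data are illustrative placeholders):

```python
# Sketch: deep matrix factorization with an explicit total-variation (TV)
# penalty for image inpainting. The observed image and mask are random stand-ins.
import torch

torch.manual_seed(0)
H, Wd = 64, 64
image = torch.rand(H, Wd)                 # placeholder "clean" image
mask = (torch.rand(H, Wd) > 0.5).float()  # observed-pixel mask
observed = image * mask

# Deep (three-factor) parameterization of the reconstruction.
factors = [torch.nn.Parameter(1e-1 * torch.randn(H, H)),
           torch.nn.Parameter(1e-1 * torch.randn(H, H)),
           torch.nn.Parameter(1e-1 * torch.randn(H, Wd))]
opt = torch.optim.Adam(factors, lr=1e-3)

def total_variation(M):
    # Anisotropic TV: sum of absolute differences between neighboring pixels.
    return (M[1:, :] - M[:-1, :]).abs().sum() + (M[:, 1:] - M[:, :-1]).abs().sum()

for step in range(2000):
    recon = factors[0] @ factors[1] @ factors[2]
    loss = ((recon - observed) * mask).pow(2).sum() + 1e-3 * total_variation(recon)
    opt.zero_grad()
    loss.backward()
    opt.step()
```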
no code implementations • 28 Jul 2020 • Zhi-Qin John Xu, Hanxu Zhou
Due to the well-studied frequency principle, i.e., that deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster.
1 code implementation • 22 Jul 2020 • Ziqi Liu, Wei Cai, Zhi-Qin John Xu
In this paper, we propose multi-scale deep neural networks (MscaleDNNs) using the idea of radial scaling in the frequency domain and activation functions with compact support.
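A minimal sketch of this architecture is given below: the input is scaled by a set of factors before entering parallel subnetworks whose outputs are summed; the scale set, widths, and the compact-support-style activation are illustrative assumptions rather than the paper's exact choices.

```python
# Sketch: a multi-scale DNN in which the input is scaled by several factors
# (covering different frequency bands) and fed to parallel subnetworks whose
# outputs are summed.
import torch
import torch.nn as nn

class SReLU(nn.Module):
    """A compact-support-style activation, relu(x) * relu(1 - x) (assumed form)."""
    def forward(self, x):
        return torch.relu(x) * torch.relu(1.0 - x)

class MscaleDNN(nn.Module):
    def __init__(self, in_dim=1, width=64, scales=(1, 2, 4, 8, 16)):
        super().__init__()
        self.scales = scales
        self.subnets = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, width), SReLU(),
                          nn.Linear(width, width), SReLU(),
                          nn.Linear(width, 1))
            for _ in scales])

    def forward(self, x):
        # Each subnetwork sees the input radially scaled by one factor.
        return sum(net(s * x) for s, net in zip(self.scales, self.subnets))

model = MscaleDNN()
print(model(torch.rand(5, 1)).shape)  # torch.Size([5, 1])
```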
1 code implementation • 15 Jul 2020 • Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang
In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.
no code implementations • 6 Dec 2019 • Zhi-Qin John Xu, Jiwei Zhang, Yaoyu Zhang, Chengchao Zhao
We first estimate the \emph{a priori} generalization error of a finite-width two-layer ReLU NN under the constraint of a minimal-norm solution, which is proved by \cite{zhang2019type} to be an equivalent solution of a linearized (w.r.t.
no code implementations • 25 Oct 2019 • Wei Cai, Zhi-Qin John Xu
In this paper, we propose the idea of radial scaling in the frequency domain and activation functions with compact support to produce a multi-scale DNN (MscaleDNN), which will have the multi-scale capability to approximate high-frequency and high-dimensional functions and to speed up the solution of high-dimensional PDEs.
1 code implementation • 21 Jun 2019 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang
Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.
1 code implementation • 24 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma
It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well.
no code implementations • 19 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma
Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.
3 code implementations • 19 Jan 2019 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma
We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.
no code implementations • 26 Nov 2018 • Zhi-Qin John Xu
Previous studies have shown that deep neural networks (DNNs) with common settings often capture target functions from low to high frequency, which is called Frequency Principle (F-Principle).
no code implementations • 13 Aug 2018 • Zhi-Qin John Xu
Background: It is still an open research area to theoretically understand why Deep Neural Networks (DNNs)---equipped with many more parameters than training data and trained by (stochastic) gradient-based methods---often achieve remarkably low generalization error.
1 code implementation • 3 Jul 2018 • Zhi-Qin John Xu, Yaoyu Zhang, Yanyang Xiao
Why deep neural networks (DNNs) capable of overfitting often generalize well in practice is a mystery [zhang2016understanding].