Search Results for author: Zhi-Qin John Xu

Found 46 papers, 9 papers with code

Understanding Time Series Anomaly State Detection through One-Class Classification

no code implementations • 3 Feb 2024 • Hanxu Zhou, Yuan Zhang, Guangjie Leng, Ruofan Wang, Zhi-Qin John Xu

Therefore, in this article, we revisit and define the time series anomaly detection problem through OCC, which we call the 'time series anomaly state detection problem'.

One-Class Classification Time Series +2

Anchor function: a type of benchmark functions for studying language models

no code implementations • 16 Jan 2024 • Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu

However, language model research faces significant challenges, especially for academic research groups with constrained resources.

Language Modelling

An Unsupervised Deep Learning Approach for the Wave Equation Inverse Problem

no code implementations • 8 Nov 2023 • Xiong-bin Yan, Keke Wu, Zhi-Qin John Xu, Zheng Ma

Full-waveform inversion (FWI) is a powerful geophysical imaging technique that infers high-resolution subsurface physical parameters by solving a non-convex optimization problem.

Bayesian Inference

Optimistic Estimate Uncovers the Potential of Nonlinear Models

no code implementations • 18 Jul 2023 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.

Stochastic Modified Equations and Dynamics of Dropout Algorithm

no code implementations • 25 May 2023 • Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu

In order to investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the derived stochastic modified equation for dropout.

Loss Spike in Training Neural Networks

no code implementations • 20 May 2023 • Zhongwang Zhang, Zhi-Qin John Xu

In this work, we study the mechanism underlying loss spikes observed during neural network training.

Understanding the Initial Condensation of Convolutional Neural Networks

no code implementations • 17 May 2023 • Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu

Previous research has shown that fully-connected networks with small initialization and gradient-based training methods exhibit a phenomenon known as condensation during training.
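A minimal sketch of how this condensation can be observed empirically (the toy data, network width, initialization scale of 0.01, and training length below are illustrative assumptions, not the paper's setup): with small initialization, the input weights of different hidden neurons tend to align in direction during early training, which the mean pairwise cosine similarity makes visible.

import torch

torch.manual_seed(0)
x = torch.rand(64, 2) * 2 - 1                      # toy inputs in [-1, 1]^2
y = (x[:, :1] ** 2 - x[:, 1:]).detach()            # toy target

net = torch.nn.Sequential(torch.nn.Linear(2, 50), torch.nn.Tanh(), torch.nn.Linear(50, 1))
with torch.no_grad():                              # shrink the default initialization
    for p in net.parameters():
        p.mul_(0.01)
opt = torch.optim.SGD(net.parameters(), lr=0.05)

def mean_abs_cosine(W):
    # mean |cosine similarity| between the input-weight directions of hidden neurons
    Wn = W / (W.norm(dim=1, keepdim=True) + 1e-12)
    return float((Wn @ Wn.t()).abs().mean())       # near 1 => directions have condensed

for step in range(2000):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()

print("mean |cos| between input-weight directions:", mean_abs_cosine(net[0].weight.data))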

Laplace-fPINNs: Laplace-based fractional physics-informed neural networks for solving forward and inverse problems of subdiffusion

no code implementations • 3 Apr 2023 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

To address this issue, this paper proposes an extension to PINNs called Laplace-based fractional physics-informed neural networks (Laplace-fPINNs), which can effectively solve the forward and inverse problems of fractional diffusion equations.

Phase Diagram of Initial Condensation for Two-layer Neural Networks

no code implementations • 12 Mar 2023 • Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research.

Bayesian Inversion with Neural Operator (BINO) for Modeling Subdiffusion: Forward and Inverse Problems

no code implementations • 22 Nov 2022 • Xiong-bin Yan, Zhi-Qin John Xu, Zheng Ma

A large number of numerical experiments demonstrate that the operator learning method proposed in this work can efficiently solve the forward problems and Bayesian inverse problems of the subdiffusion equation.

Operator learning

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

no code implementations • 21 Nov 2022 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

By these results, the model rank of a target function predicts the minimal training data size required for its successful recovery.

Implicit regularization of dropout

no code implementations • 13 Jul 2022 • Zhongwang Zhang, Zhi-Qin John Xu

Secondly, we experimentally find that training with dropout leads to a neural network with a flatter minimum than standard gradient descent training, and that this implicit regularization is the key to finding flat solutions.
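A hedged sketch of one simple way to compare flatness in the sense described above (the relative Gaussian weight noise, perturbation scale, and number of trials are illustrative choices, not the paper's exact measure): perturb the trained weights and record how much the training loss rises; a flatter minimum shows a smaller rise.

import copy
import torch

def sharpness(model, loss_fn, x, y, scale=0.01, trials=20):
    # average training-loss increase under random relative weight perturbations
    base = float(loss_fn(model(x), y))
    rises = []
    for _ in range(trials):
        noisy = copy.deepcopy(model)
        with torch.no_grad():
            for p in noisy.parameters():
                p.add_(scale * p.abs().mean() * torch.randn_like(p))
        rises.append(float(loss_fn(noisy(x), y)) - base)
    return sum(rises) / trials      # larger value => sharper (less flat) minimum

# usage idea: compare sharpness(model_trained_with_dropout, ...) against
# sharpness(model_trained_without_dropout, ...) on the same training data.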

Embedding Principle in Depth for the Loss Landscape Analysis of Deep Neural Networks

no code implementations • 26 May 2022 • Zhiwei Bai, Tao Luo, Zhi-Qin John Xu, Yaoyu Zhang

Regarding the easy training of deep networks, we show that a local minimum of an NN can be lifted to strict saddle points of a deeper NN.

An Experimental Comparison Between Temporal Difference and Residual Gradient with Neural Network Approximation

no code implementations • 25 May 2022 • Shuyu Yin, Tao Luo, Peilin Liu, Zhi-Qin John Xu

In this work, we perform extensive experiments to show that TD outperforms RG: when training leads to a small Bellman residual error, the solution found by TD yields a better policy and is more robust against perturbations of the neural network parameters.

Q-Learning reinforcement-learning +1
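For reference, a minimal sketch of the two update rules being compared in the entry above (the toy transition, network size, and step size are illustrative assumptions): temporal difference (TD) treats the bootstrapped target as a constant, i.e., a semi-gradient update, while residual gradient (RG) differentiates through both sides of the Bellman residual.

import torch

v = torch.nn.Sequential(torch.nn.Linear(1, 32), torch.nn.Tanh(), torch.nn.Linear(32, 1))
opt = torch.optim.SGD(v.parameters(), lr=1e-2)
gamma = 0.9

def update(s, r, s_next, method="td"):
    if method == "td":
        target = r + gamma * v(s_next).detach()   # semi-gradient: no grad through the target
    else:                                         # residual gradient
        target = r + gamma * v(s_next)            # gradient flows through the target
    loss = ((v(s) - target) ** 2).mean()          # squared Bellman residual
    opt.zero_grad(); loss.backward(); opt.step()
    return float(loss)

# one illustrative transition s -> s' with reward r = 1
s, s_next, r = torch.tensor([[0.1]]), torch.tensor([[0.2]]), torch.tensor([[1.0]])
print(update(s, r, s_next, "td"), update(s, r, s_next, "rg"))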

Empirical Phase Diagram for Three-layer Neural Networks with Infinite Width

no code implementations • 24 May 2022 • Hanxu Zhou, Qixuan Zhou, Zhenyuan Jin, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Through experiments under the three-layer condition, our phase diagram suggests a complicated set of dynamical regimes for deep NNs, consisting of three possible regimes together with their mixtures, and provides guidance for studying deep NNs under different initializations, revealing the possibility of completely different dynamics emerging in different layers of a single deep NN.

Limitation of Characterizing Implicit Regularization by Data-independent Functions

no code implementations • 28 Jan 2022 • Leyang Zhang, Zhi-Qin John Xu, Tao Luo, Yaoyu Zhang

In recent years, understanding the implicit regularization of neural networks (NNs) has become a central task in deep learning theory.

Learning Theory

Overview frequency principle/spectral bias in deep learning

no code implementations • 19 Jan 2022 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo

This low-frequency implicit bias reveals the strength of neural networks in learning low-frequency functions as well as their deficiency in learning high-frequency functions.
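A minimal illustration of this low-frequency bias (the 1-D target, network size, and training schedule below are illustrative assumptions, not taken from the overview): training a small network on a target with one low and one high frequency, and tracking the relative spectral error, typically shows the low-frequency mode being fitted first.

import torch

torch.manual_seed(0)
x = torch.linspace(-1, 1, 201).unsqueeze(1)
y = torch.sin(torch.pi * x) + 0.5 * torch.sin(10 * torch.pi * x)   # low + high frequency target

net = torch.nn.Sequential(torch.nn.Linear(1, 200), torch.nn.Tanh(), torch.nn.Linear(200, 1))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

y_freq = torch.fft.rfft(y.squeeze())                 # target spectrum
for step in range(1, 5001):
    loss = ((net(x) - y) ** 2).mean()
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 1000 == 0:
        err_freq = torch.fft.rfft((net(x) - y).squeeze().detach())
        rel = err_freq.abs() / (y_freq.abs() + 1e-8)
        # relative error of the low (k=1) and high (k=10) modes; the low mode
        # usually drops much earlier, in line with the F-Principle
        print(step, float(rel[1]), float(rel[10]))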

A multi-scale sampling method for accurate and robust deep neural network to predict combustion chemical kinetics

no code implementations • 9 Jan 2022 • Tianhan Zhang, Yuxiao Yi, Yifan Xu, Zhi X. Chen, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu

The current work aims to understand two basic questions regarding the deep neural network (DNN) method: what data the DNN needs and how general the DNN method can be.

A deep learning-based model reduction (DeePMR) method for simplifying chemical kinetics

no code implementations • 6 Jan 2022 • Zhiwei Wang, Yaoyu Zhang, Enhan Zhao, Yiguang Ju, Weinan E, Zhi-Qin John Xu, Tianhan Zhang

The mechanism reduction is modeled as an optimization problem on Boolean space, where a Boolean vector, each entry corresponding to a species, represents a reduced mechanism.
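A small, hypothetical illustration of this Boolean encoding (the species list, the candidate vector, and the scoring function below are placeholders, not the paper's mechanism or objective):

import numpy as np

species = ["H2", "O2", "H2O", "OH", "H", "O", "HO2", "H2O2"]   # hypothetical species list
keep = np.array([1, 1, 1, 1, 1, 0, 0, 0], dtype=bool)           # one candidate reduced mechanism

reduced = [s for s, k in zip(species, keep) if k]
print(len(reduced), "of", len(species), "species retained:", reduced)

def objective(keep_vec, rng=np.random.default_rng(0)):
    # placeholder score: trade mechanism size against an error term; the random
    # number here stands in for the error a simulation (or a trained surrogate)
    # would assign to the reduced mechanism encoded by keep_vec
    size_cost = keep_vec.sum() / len(keep_vec)
    error = rng.random()
    return error + 0.1 * size_cost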

Subspace Decomposition based DNN algorithm for elliptic-type multi-scale PDEs

1 code implementation • 10 Dec 2021 • Xi-An Li, Zhi-Qin John Xu, Lei Zhang

Numerical results show that the SD$^2$NN model is superior to existing models such as MscaleDNN.

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs) that unravels a hierarchical structure of the loss landscape of NNs, i.e., the loss landscape of an NN contains all critical points of all narrower NNs.
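A small numerical check of the kind of embedding this principle relies on (the two-layer tanh network and the splitting coefficient 0.3 are illustrative; the paper's constructions are more general): duplicating a hidden neuron and splitting its output weight between the two copies yields a wider network that computes exactly the same function, so any critical point of the narrow network reappears in the wider landscape.

import numpy as np

rng = np.random.default_rng(0)
W, a = rng.normal(size=(3, 2)), rng.normal(size=3)      # narrow net: 3 hidden neurons
f_narrow = lambda x: np.tanh(x @ W.T) @ a

lam = 0.3                                                # split neuron 0
W_wide = np.vstack([W, W[0:1]])                          # duplicate its input weights
a_wide = np.concatenate([a * [lam, 1, 1], [(1 - lam) * a[0]]])   # split its output weight
f_wide = lambda x: np.tanh(x @ W_wide.T) @ a_wide

x = rng.normal(size=(5, 2))
print(np.allclose(f_narrow(x), f_wide(x)))               # True: same function, wider net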

Dropout in Training Neural Networks: Flatness of Solution and Noise Structure

no code implementations • 1 Nov 2021 • Zhongwang Zhang, Hanxu Zhou, Zhi-Qin John Xu

It is important to understand how the popular regularization method dropout helps the neural network training find a good generalization solution.

Data-informed Deep Optimization

no code implementations • 17 Jul 2021 • Lulu Zhang, Zhi-Qin John Xu, Yaoyu Zhang

Complex design problems are common in the scientific and industrial fields.

Force-in-domain GAN inversion

no code implementations • 13 Jul 2021 • Guangjie Leng, Yekun Zhu, Zhi-Qin John Xu

An in-domain GAN inversion approach was recently proposed to constrain the inverted code within the latent space by forcing the reconstructed image obtained from the inverted code to lie within the real image space.

MOD-Net: A Machine Learning Approach via Model-Operator-Data Network for Solving PDEs

no code implementations • 8 Jul 2021 • Lulu Zhang, Tao Luo, Yaoyu Zhang, Weinan E, Zhi-Qin John Xu, Zheng Ma

In this paper, we propose a machine learning approach via a model-operator-data network (MOD-Net) for solving PDEs.

Towards Understanding the Condensation of Neural Networks at Initial Training

no code implementations • 25 May 2021 • Hanxu Zhou, Qixuan Zhou, Tao Luo, Yaoyu Zhang, Zhi-Qin John Xu

Our theoretical analysis confirms the experiments for two cases: one for activation functions of multiplicity one with arbitrary-dimensional input, which covers many common activation functions, and the other for a layer with one-dimensional input and arbitrary multiplicity.

Linear Frequency Principle Model to Understand the Absence of Overfitting in Neural Networks

no code implementations • 30 Jan 2021 • Yaoyu Zhang, Tao Luo, Zheng Ma, Zhi-Qin John Xu

Why heavily parameterized neural networks (NNs) do not overfit the data is an important, long-standing open question.

Frequency Principle in Deep Learning Beyond Gradient-descent-based Training

no code implementations • 4 Jan 2021 • Yuheng Ma, Zhi-Qin John Xu, Jiwei Zhang

The frequency perspective has recently made progress in understanding deep learning.

Fourier-domain Variational Formulation and Its Well-posedness for Supervised Learning

no code implementations • 6 Dec 2020 • Tao Luo, Zheng Ma, Zhiwei Wang, Zhi-Qin John Xu, Yaoyu Zhang

A supervised learning problem is to find a function in a hypothesis function space given values on isolated data points.

On the exact computation of linear frequency principle dynamics and its generalization

1 code implementation • 15 Oct 2020 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Recent works show an intriguing phenomenon of Frequency Principle (F-Principle) that deep neural networks (DNNs) fit the target function from low to high frequency during the training, which provides insight into the training and generalization behavior of DNNs in complex tasks.

A multi-scale DNN algorithm for nonlinear elliptic equations with multiple scales

no code implementations • 30 Sep 2020 • Xi-An Li, Zhi-Qin John Xu, Lei Zhang

Algorithms based on deep neural networks (DNNs) have attracted increasing attention from the scientific computing community.

Computational Physics Analysis of PDEs

A regularized deep matrix factorized model of matrix completion for image restoration

2 code implementations • 29 Jul 2020 • Zhemin Li, Zhi-Qin John Xu, Tao Luo, Hongxia Wang

In this work, we propose a Regularized Deep Matrix Factorized (RDMF) model for image restoration, which utilizes the implicit low-rank bias of deep neural networks and the explicit bias of total variation.

Image Restoration Matrix Completion
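A hedged sketch of the two biases mentioned in the entry above (the matrix size, factor depth, regularization weight, and optimizer settings are illustrative, not the paper's RDMF configuration): a product of several factor matrices supplies the implicit low-rank bias, and a total-variation penalty on the product supplies the explicit smoothness bias, fitted only on the observed entries.

import torch

torch.manual_seed(0)
n = 32
M = torch.rand(n, n)                               # toy "image" to complete
mask = (torch.rand(n, n) < 0.5).float()            # observed entries

factors = [torch.nn.Parameter(0.1 * torch.randn(n, n)) for _ in range(3)]
opt = torch.optim.Adam(factors, lr=1e-2)

def total_variation(X):
    return (X[1:, :] - X[:-1, :]).abs().mean() + (X[:, 1:] - X[:, :-1]).abs().mean()

for step in range(3000):
    X = factors[0] @ factors[1] @ factors[2]       # deep factorization: implicit low-rank bias
    loss = ((X - M) * mask).pow(2).mean() + 0.01 * total_variation(X)   # explicit TV bias
    opt.zero_grad(); loss.backward(); opt.step()

with torch.no_grad():
    X = factors[0] @ factors[1] @ factors[2]
    print("full-matrix reconstruction error:", float(((X - M) ** 2).mean()))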

Deep frequency principle towards understanding why deeper learning is faster

no code implementations • 28 Jul 2020 • Zhi-Qin John Xu, Hanxu Zhou

Due to the well-studied frequency principle, i.e., deep neural networks learn lower-frequency functions faster, the deep frequency principle provides a reasonable explanation of why deeper learning is faster.

Multi-scale Deep Neural Network (MscaleDNN) for Solving Poisson-Boltzmann Equation in Complex Domains

1 code implementation • 22 Jul 2020 • Ziqi Liu, Wei Cai, Zhi-Qin John Xu

In this paper, we propose multi-scale deep neural networks (MscaleDNNs) using the idea of radial scaling in frequency domain and activation functions with compact support.
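A minimal sketch of the radial-scaling idea (the scale factors, widths, and Tanh activation below are illustrative stand-ins; the paper uses activation functions with compact support): each subnetwork sees the input multiplied by a different scale factor, so high-frequency content appears as lower frequency to some branch, and the branch outputs are summed.

import torch
import torch.nn as nn

class MscaleDNNSketch(nn.Module):
    def __init__(self, dim_in=2, width=64, scales=(1.0, 2.0, 4.0, 8.0)):
        super().__init__()
        self.scales = scales
        self.subnets = nn.ModuleList([
            nn.Sequential(nn.Linear(dim_in, width), nn.Tanh(),
                          nn.Linear(width, width), nn.Tanh(),
                          nn.Linear(width, 1))
            for _ in scales
        ])

    def forward(self, x):
        # each branch processes a radially scaled copy of the input; outputs are summed
        return sum(net(s * x) for s, net in zip(self.scales, self.subnets))

u = MscaleDNNSketch()(torch.rand(16, 2))   # usage: a batch of 16 points in 2-D
print(u.shape)                             # torch.Size([16, 1])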

Phase diagram for two-layer ReLU neural networks at infinite-width limit

1 code implementation • 15 Jul 2020 • Tao Luo, Zhi-Qin John Xu, Zheng Ma, Yaoyu Zhang

In this work, inspired by the phase diagram in statistical mechanics, we draw the phase diagram for the two-layer ReLU neural network at the infinite-width limit for a complete characterization of its dynamical regimes and their dependence on hyperparameters related to initialization.

A priori generalization error for two-layer ReLU neural network through minimum norm solution

no code implementations • 6 Dec 2019 • Zhi-Qin John Xu, Jiwei Zhang, Yaoyu Zhang, Chengchao Zhao

We first estimate the \emph{a priori} generalization error of a finite-width two-layer ReLU NN under the constraint of the minimal-norm solution, which is proved by \cite{zhang2019type} to be an equivalent solution of a linearized (w.r.t.

Multi-scale Deep Neural Networks for Solving High Dimensional PDEs

no code implementations • 25 Oct 2019 • Wei Cai, Zhi-Qin John Xu

In this paper, we propose the idea of radial scaling in frequency domain and activation functions with compact support to produce a multi-scale DNN (MscaleDNN), which will have the multi-scale capability in approximating high frequency and high dimensional functions and speeding up the solution of high dimensional PDEs.

Theory of the Frequency Principle for General Deep Neural Networks

1 code implementation • 21 Jun 2019 • Tao Luo, Zheng Ma, Zhi-Qin John Xu, Yaoyu Zhang

Along with fruitful applications of Deep Neural Networks (DNNs) to realistic problems, recently, some empirical studies of DNNs reported a universal phenomenon of Frequency Principle (F-Principle): a DNN tends to learn a target function from low to high frequencies during the training.

Explicitizing an Implicit Bias of the Frequency Principle in Two-layer Neural Networks

1 code implementation • 24 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

It remains a puzzle why deep neural networks (DNNs), with more parameters than samples, often generalize well.

A type of generalization error induced by initialization in deep neural networks

no code implementations • 19 May 2019 • Yaoyu Zhang, Zhi-Qin John Xu, Tao Luo, Zheng Ma

Overall, our work serves as a baseline for the further investigation of the impact of initialization and loss function on the generalization of DNNs, which can potentially guide and improve the training of DNNs in practice.

Frequency Principle: Fourier Analysis Sheds Light on Deep Neural Networks

3 code implementations • 19 Jan 2019 • Zhi-Qin John Xu, Yaoyu Zhang, Tao Luo, Yanyang Xiao, Zheng Ma

We study the training process of Deep Neural Networks (DNNs) from the Fourier analysis perspective.

Frequency Principle in Deep Learning with General Loss Functions and Its Potential Application

no code implementations • 26 Nov 2018 • Zhi-Qin John Xu

Previous studies have shown that deep neural networks (DNNs) with common settings often capture target functions from low to high frequency, which is called Frequency Principle (F-Principle).

Understanding training and generalization in deep learning by Fourier analysis

no code implementations • 13 Aug 2018 • Zhi-Qin John Xu

Background: It is still an open research area to theoretically understand why Deep Neural Networks (DNNs)---equipped with many more parameters than training data and trained by (stochastic) gradient-based methods---often achieve remarkably low generalization error.

Training behavior of deep neural network in frequency domain

1 code implementation • 3 Jul 2018 • Zhi-Qin John Xu, Yaoyu Zhang, Yanyang Xiao

Why deep neural networks (DNNs) capable of overfitting often generalize well in practice is a mystery (Zhang et al., 2016).
