no code implementations • 16 Jan 2024 • Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu
However, language model research faces significant challenges, especially for academic research groups with constrained resources.
no code implementations • 18 Jul 2023 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu
We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.
no code implementations • 25 May 2023 • Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu
To investigate the mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the stochastic modified equation derived for dropout.
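As an illustrative aside (not the paper's code or model): dropout can be viewed as multiplicative Bernoulli noise on the hidden activations, and the gradient noise it induces at a fixed parameter point can be sampled directly. The sketch below uses a toy one-hidden-layer network of my own choosing to estimate that noise covariance; its unequal eigenvalues show the noise is structured rather than isotropic, which is the kind of structure such an analysis examines.

```python
# Illustrative sketch only: sample the gradient noise dropout injects at a
# fixed parameter point of a toy 1-hidden-layer network. The data, network,
# and sizes are all assumptions made for this example.
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(64, 3))          # toy inputs
y = np.sin(X.sum(axis=1))             # toy targets
W = 0.5 * rng.normal(size=(3, 8))     # input -> hidden weights
a = 0.5 * rng.normal(size=8)          # hidden -> output weights
p = 0.5                               # dropout keep probability

def grad_a(mask):
    """Gradient of the MSE loss w.r.t. the output weights `a`
    for one dropout mask applied to the hidden layer."""
    h = np.tanh(X @ W) * mask / p     # inverted-dropout scaling
    err = h @ a - y
    return 2.0 * (h.T @ err) / len(X)

samples = np.stack([grad_a(rng.binomial(1, p, size=8)) for _ in range(2000)])
noise = samples - samples.mean(axis=0)          # dropout-induced noise
cov = noise.T @ noise / len(samples)
# Unequal eigenvalues indicate anisotropic (structured) gradient noise.
print(np.round(np.linalg.eigvalsh(cov), 4))
```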
no code implementations • 20 May 2023 • Zhongwang Zhang, Zhi-Qin John Xu
In this work, we study the mechanism underlying loss spikes observed during neural network training.
no code implementations • 21 Nov 2022 • Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu
Based on these results, the model rank of a target function predicts the minimal training data size required for its successful recovery.
no code implementations • 13 Jul 2022 • Zhongwang Zhang, Zhi-Qin John Xu
Second, we find experimentally that training with dropout leads the neural network to a flatter minimum than standard gradient descent training does, and that this implicit regularization is the key to finding flat solutions.
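For context, one common flatness proxy (not necessarily the measure used in the paper) is the average loss increase under random parameter perturbations: a flatter minimum shows a smaller increase at the same perturbation scale. A sketch on a toy 1-D loss with one sharp and one flat minimum:

```python
# Illustrative flatness proxy on a toy 1-D loss; the loss function and
# perturbation scale are assumptions made for this example.
import numpy as np

rng = np.random.default_rng(1)

def loss(w):
    """Toy loss with a sharp minimum at w=0 and a flat one at w=4."""
    return np.minimum(50.0 * w**2, 0.5 * (w - 4.0) ** 2)

def sharpness(w, scale=0.2, trials=10_000):
    """Mean loss increase when w is jittered by Gaussian noise."""
    bumps = loss(w + scale * rng.normal(size=trials)) - loss(w)
    return float(np.mean(bumps))

print("sharp minimum:", sharpness(0.0))   # large increase
print("flat  minimum:", sharpness(4.0))   # small increase
```

One would compare such a proxy at a dropout-trained minimum against a plain-gradient-descent minimum to quantify the flatness difference.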
no code implementations • 30 Nov 2021 • Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
We prove a general Embedding Principle of the loss landscape of deep neural networks (NNs) that unravels its hierarchical structure, i.e., the loss landscape of an NN contains all critical points of all narrower NNs.
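To make the statement concrete, here is a minimal sketch of a splitting-type embedding (the paper's exact construction may differ): duplicating a hidden neuron and splitting its outgoing weight maps a narrower network's parameters into a wider network without changing the network function, so the loss value is preserved at the embedded point.

```python
# Illustrative splitting embedding on a toy 1-hidden-layer network; the
# network sizes and probe data are assumptions made for this example.
import numpy as np

rng = np.random.default_rng(2)
X = rng.normal(size=(5, 3))                 # a few probe inputs

W = rng.normal(size=(3, 4))                 # narrow net: 4 hidden neurons
a = rng.normal(size=4)

def f(W, a, X):
    return np.tanh(X @ W) @ a

# Split neuron 0: copy its incoming weights, share its outgoing weight
# between the original and the duplicate (shares sum to the original).
alpha = 0.3
W_wide = np.column_stack([W, W[:, 0]])      # wider net: 5 hidden neurons
a_wide = np.concatenate([a, [0.0]])
a_wide[0], a_wide[4] = alpha * a[0], (1 - alpha) * a[0]

print(np.allclose(f(W, a, X), f(W_wide, a_wide, X)))  # True: same function
```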
no code implementations • 1 Nov 2021 • Zhongwang Zhang, Hanxu Zhou, Zhi-Qin John Xu
It is important to understand how dropout, a popular regularization method, helps neural network training find a solution that generalizes well.
no code implementations • NeurIPS 2021 • Yaoyu Zhang, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu
Understanding the structure of the loss landscape of deep neural networks (DNNs) is clearly important.