Search Results for author: Zhongwang Zhang

Found 9 papers, 0 papers with code

Anchor function: a type of benchmark functions for studying language models

no code implementations16 Jan 2024 Zhongwang Zhang, Zhiwei Wang, Junjie Yao, Zhangchen Zhou, Xiaolong Li, Weinan E, Zhi-Qin John Xu

Language model research faces significant challenges, especially for academic research groups with constrained resources.

Task: Language Modelling
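
This snippet does not describe the benchmark itself; the sketch below is a hypothetical toy generator in the spirit the title suggests, where an anchor token selects a simple operation (here, addition by the anchor) applied to a key token. The task design, value ranges, and the make_batch name are illustrative assumptions, not the paper's specification.

# Hypothetical toy "anchor function" data: each example pairs an anchor token
# (selecting an operation) with a key token; the label applies that operation.
import numpy as np

rng = np.random.default_rng(0)

def make_batch(batch_size=32, n_anchors=4, key_range=100):
    anchors = rng.integers(1, n_anchors + 1, size=batch_size)  # anchor tokens 1..n_anchors
    keys = rng.integers(0, key_range, size=batch_size)         # key tokens 0..key_range-1
    labels = keys + anchors            # assumed operation: add the anchor to the key
    sequences = np.stack([anchors, keys], axis=1)              # (anchor, key) pairs
    return sequences, labels

X, y = make_batch()
print(X[:3], y[:3])

Because such data are synthetic and tiny, tasks of this kind can be trained from scratch on modest hardware, which is what makes them usable by resource-constrained groups.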

Optimistic Estimate Uncovers the Potential of Nonlinear Models

no code implementations18 Jul 2023 Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

We propose an optimistic estimate to evaluate the best possible fitting performance of nonlinear models.

Stochastic Modified Equations and Dynamics of Dropout Algorithm

no code implementations25 May 2023 Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu

To investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the stochastic modified equation derived for dropout.

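As a minimal sketch of the noise source in question (the multiplicative noise dropout injects, not the paper's derivation), inverted dropout multiplies each activation by a rescaled Bernoulli mask, so the injected noise is zero-mean and scales with the activations themselves:

# Inverted dropout as multiplicative Bernoulli noise on activations.
import numpy as np

rng = np.random.default_rng(0)

def dropout_forward(h, p=0.5):
    """Keep each unit with probability 1 - p and rescale by 1/(1 - p)."""
    mask = rng.random(h.shape) >= p
    return h * mask / (1.0 - p)

h = np.ones(8)
print(dropout_forward(h))  # one random rescaled mask of the activations
# Averaged over masks the activations are unchanged (zero-mean noise):
print(np.mean([dropout_forward(h) for _ in range(10000)], axis=0))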

Loss Spike in Training Neural Networks

no code implementations20 May 2023 Zhongwang Zhang, Zhi-Qin John Xu

In this work, we study the mechanism underlying loss spikes observed during neural network training.

Linear Stability Hypothesis and Rank Stratification for Nonlinear Models

no code implementations21 Nov 2022 Yaoyu Zhang, Zhongwang Zhang, Leyang Zhang, Zhiwei Bai, Tao Luo, Zhi-Qin John Xu

By these results, the model rank of a target function predicts the minimal training data size needed for its successful recovery.

Implicit regularization of dropout

no code implementations13 Jul 2022 Zhongwang Zhang, Zhi-Qin John Xu

Second, we find experimentally that training with dropout leads the neural network to a flatter minimum than standard gradient-descent training does, and that this implicit regularization is the key to finding flat solutions.
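
One common way to probe flatness, sketched below with placeholder losses, is to average the loss increase under small random weight perturbations; a flatter minimum shows a smaller increase. The sharpness name, the radius, and the toy quadratics are illustrative assumptions, not the paper's experimental setup.

# Flatness probe: mean loss increase under random perturbations of fixed norm.
import numpy as np

rng = np.random.default_rng(0)

def sharpness(loss_fn, w, radius=0.01, n_samples=100):
    base = loss_fn(w)
    increases = []
    for _ in range(n_samples):
        d = rng.normal(size=w.shape)
        d *= radius / np.linalg.norm(d)   # perturb on a sphere of the given radius
        increases.append(loss_fn(w + d) - base)
    return float(np.mean(increases))

sharp = lambda w: 50.0 * np.sum(w**2)  # stands in for a sharp minimum
flat = lambda w: 0.5 * np.sum(w**2)    # stands in for a flat minimum
w0 = np.zeros(10)
print(sharpness(sharp, w0), ">", sharpness(flat, w0))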

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

no code implementations30 Nov 2021 Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

We prove a general Embedding Principle for the loss landscape of deep neural networks (NNs) that unravels its hierarchical structure: the loss landscape of an NN contains all critical points of all narrower NNs.
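
A minimal sketch of one such embedding, written only from the statement above (not the paper's general construction): splitting a hidden neuron into two copies with halved output weights maps a narrower one-hidden-layer network onto a wider one computing the same function, so every critical point of the narrow network reappears in the wide network's landscape.

# Neuron-splitting embedding for a one-hidden-layer tanh network.
import numpy as np

def forward(W, a, x):
    """f(x) = a . tanh(W x)."""
    return a @ np.tanh(W @ x)

def split_neuron(W, a, j):
    """Duplicate hidden neuron j; give each copy half of its output weight."""
    W_wide = np.vstack([W, W[j:j + 1]])
    a_wide = np.concatenate([a, [a[j] / 2.0]])
    a_wide[j] /= 2.0
    return W_wide, a_wide

rng = np.random.default_rng(0)
W, a, x = rng.normal(size=(3, 4)), rng.normal(size=3), rng.normal(size=4)
W2, a2 = split_neuron(W, a, j=1)
print(np.allclose(forward(W, a, x), forward(W2, a2, x)))  # True: same function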

Dropout in Training Neural Networks: Flatness of Solution and Noise Structure

no code implementations1 Nov 2021 Zhongwang Zhang, Hanxu Zhou, Zhi-Qin John Xu

It is important to understand how dropout, a popular regularization method, helps neural network training find solutions that generalize well.
