Search Results for author: Yuqing Li

Found 10 papers, 3 papers with code

Demystifying Lazy Training of Neural Networks from a Macroscopic Viewpoint

no code implementations • 7 Apr 2024 Yuqing Li, Tao Luo, Qixuan Zhou

While NTK typically assumes that $\lim_{m\to\infty}\frac{\log \kappa}{\log m}=\frac{1}{2}$ and requires each weight parameter to scale by the factor $\frac{1}{\sqrt{m}}$, in our $\theta$-lazy regime we discard this factor and relax the condition to $\lim_{m\to\infty}\frac{\log \kappa}{\log m}>0$.
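
A compact way to read this relaxation (a hedged restatement; the polynomial scaling $\kappa = m^{\gamma}$ is an illustrative choice, not taken from the paper): if the scale parameter grows polynomially in the width $m$, say $\kappa = m^{\gamma}$, then

$$\lim_{m\to\infty}\frac{\log \kappa}{\log m} = \gamma,$$

so the NTK setting corresponds to the single choice $\gamma = \tfrac{1}{2}$, while the $\theta$-lazy regime admits any $\gamma > 0$.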

AC-EVAL: Evaluating Ancient Chinese Language Understanding in Large Language Models

1 code implementation • 11 Mar 2024 Yuting Wei, Yuanxing Xu, Xinru Wei, Simin Yang, Yangfu Zhu, Yuqing Li, Di Liu, Bin Wu

Given the importance of ancient Chinese in capturing the essence of rich historical and cultural heritage, the rapid advancements in Large Language Models (LLMs) necessitate benchmarks that can effectively evaluate their understanding of ancient contexts.

Philosophy Reading Comprehension

Dynamic Multi-Scale Context Aggregation for Conversational Aspect-Based Sentiment Quadruple Analysis

1 code implementation • 27 Sep 2023 Yuqing Li, Wenyuan Zhang, Binbin Li, Siyu Jia, Zisen Qi, Xingbang Tan

Conversational aspect-based sentiment quadruple analysis (DiaASQ) aims to extract the quadruple of target-aspect-opinion-sentiment within a dialogue.
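
To make the extraction target concrete, here is a minimal sketch of the quadruple structure described above; the dialogue, entity names, and labels are hypothetical and not taken from the DiaASQ dataset.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class SentimentQuadruple:
    target: str     # entity being discussed, e.g. a product
    aspect: str     # attribute of the target
    opinion: str    # opinion expression appearing in the dialogue
    sentiment: str  # polarity: "positive", "negative", or "neutral"

# Hypothetical two-turn dialogue and the quadruples a DiaASQ-style
# system would be expected to extract from it.
dialogue: List[str] = [
    "A: How do you like the XPhone 12?",
    "B: The battery drains fast, but the screen is gorgeous.",
]
expected: List[SentimentQuadruple] = [
    SentimentQuadruple("XPhone 12", "battery", "drains fast", "negative"),
    SentimentQuadruple("XPhone 12", "screen", "gorgeous", "positive"),
]
```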

Stochastic Modified Equations and Dynamics of Dropout Algorithm

no code implementations • 25 May 2023 Zhongwang Zhang, Yuqing Li, Tao Luo, Zhi-Qin John Xu

In order to investigate the underlying mechanism by which dropout facilitates the identification of flatter minima, we study the noise structure of the derived stochastic modified equation for dropout.

Relation
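
As a rough illustration of where the noise studied in this paper comes from (a minimal sketch, not the paper's derivation): dropout multiplies hidden activations by random Bernoulli masks, and it is the step-to-step randomness of these masks that a stochastic modified equation models.

```python
import numpy as np

rng = np.random.default_rng(0)

def dropout_mask(shape, p=0.5):
    """Bernoulli keep/drop mask, rescaled so its mean is 1 (inverted dropout)."""
    return rng.binomial(1, 1.0 - p, size=shape) / (1.0 - p)

def forward(x, W1, W2, p=0.5, train=True):
    """Toy two-layer ReLU network with dropout on the hidden layer."""
    h = np.maximum(W1 @ x, 0.0)
    if train:
        h = h * dropout_mask(h.shape, p)  # multiplicative noise on activations
    return W2 @ h

# Example usage on random toy weights.
x, W1, W2 = rng.normal(size=4), rng.normal(size=(8, 4)), rng.normal(size=(1, 8))
y = forward(x, W1, W2)
```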

Understanding the Initial Condensation of Convolutional Neural Networks

no code implementations • 17 May 2023 Zhangchen Zhou, Hanxu Zhou, Yuqing Li, Zhi-Qin John Xu

Previous research has shown that fully-connected networks with small initialization and gradient-based training methods exhibit a phenomenon known as condensation during training.
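
Condensation refers to the input weight vectors of hidden neurons aligning towards a small number of directions during training. A simple hedged diagnostic (illustrative only, not code from the paper) is to inspect the pairwise cosine similarity of those weight vectors:

```python
import numpy as np

def cosine_similarity_matrix(W):
    """Pairwise cosine similarity between rows of W (one row per hidden neuron)."""
    U = W / (np.linalg.norm(W, axis=1, keepdims=True) + 1e-12)
    return U @ U.T

# Toy first-layer weights at small initialization; as training proceeds,
# condensation would show up as large blocks of entries near +1 or -1,
# i.e. many neurons sharing (almost) the same direction.
W = np.random.default_rng(0).normal(scale=1e-3, size=(64, 10))
S = cosine_similarity_matrix(W)
```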

Phase Diagram of Initial Condensation for Two-layer Neural Networks

no code implementations • 12 Mar 2023 Zhengan Chen, Yuqing Li, Tao Luo, Zhangchen Zhou, Zhi-Qin John Xu

The phenomenon of distinct behaviors exhibited by neural networks under varying scales of initialization remains an enigma in deep learning research.

Vocal Bursts Valence Prediction

Embedding Principle: a hierarchical structure of loss landscape of deep neural networks

no code implementations • 30 Nov 2021 Yaoyu Zhang, Yuqing Li, Zhongwang Zhang, Tao Luo, Zhi-Qin John Xu

We prove a general Embedding Principle for the loss landscape of deep neural networks (NNs) that unravels its hierarchical structure, i.e., the loss landscape of an NN contains all critical points of all narrower NNs.
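
One concrete way to see such an embedding for a two-layer network (a standard neuron-splitting sketch under simplifying assumptions; the paper's general embedding operators are more elaborate): given a critical point of an $m$-neuron network $f(x)=\sum_{k=1}^{m} a_k\,\sigma(w_k^{\top}x)$, split neuron $j$ into two copies,

$$w_{m+1}=w_j,\qquad a_j \mapsto \alpha a_j,\qquad a_{m+1}=(1-\alpha)a_j,\qquad \alpha\in[0,1],$$

which leaves the network function unchanged, and the gradients with respect to the split parameters are just rescaled copies of the original (zero) gradients, so the wider network inherits the critical point.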

Nonlinear Weighted Directed Acyclic Graph and A Priori Estimates for Neural Networks

no code implementations • 30 Mar 2021 Yuqing Li, Tao Luo, Chao Ma

In an attempt to better understand the structural benefits and generalization power of deep neural networks, we first present a novel graph-theoretical formulation of neural network models, including fully-connected networks, residual networks (ResNet), and densely connected networks (DenseNet).
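
As an illustrative companion to this formulation (a hedged sketch: it only encodes which layers feed which, not the paper's specific nonlinear weighted DAG), the connectivity patterns of the three architecture families can be written as directed acyclic graphs over layer indices:

```python
from typing import Dict, List

def feedforward_dag(n_layers: int) -> Dict[int, List[int]]:
    """Plain fully-connected chain: layer l feeds only layer l+1."""
    return {l: [l + 1] for l in range(n_layers - 1)}

def resnet_dag(n_layers: int) -> Dict[int, List[int]]:
    """Chain edges plus identity skip connections from layer l to layer l+2."""
    dag = {l: [l + 1] for l in range(n_layers - 1)}
    for l in range(n_layers - 2):
        dag[l].append(l + 2)
    return dag

def densenet_dag(n_layers: int) -> Dict[int, List[int]]:
    """Dense connectivity: each layer feeds all later layers."""
    return {l: list(range(l + 1, n_layers)) for l in range(n_layers - 1)}
```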

Towards an Understanding of Residual Networks Using Neural Tangent Hierarchy (NTH)

no code implementations • 7 Jul 2020 Yuqing Li, Tao Luo, Nung Kwan Yip

Gradient descent yields zero training loss in polynomial time for deep neural networks despite the non-convex nature of the objective function.
