Search Results for author: Dachao Lin

Found 8 papers, 0 papers with code

Directional Convergence Analysis under Spherically Symmetric Distribution

no code implementations · 9 May 2021 · Dachao Lin, Zhihua Zhang

We consider the fundamental problem of learning linear predictors (i.e., separable datasets with zero margin) using neural networks with gradient flow or gradient descent.
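The setting in this abstract can be illustrated with a toy experiment (our own sketch, not the paper's analysis): plain gradient descent on the logistic loss of a linear predictor over a linearly separable dataset, where the predictor's direction converges to a separator.

```python
import numpy as np

# Toy illustration: gradient descent on the logistic loss for a linear
# predictor over a separable dataset. The data and step size are ours.
rng = np.random.default_rng(0)
X = rng.normal(size=(20, 2))
w_star = np.array([1.0, -1.0])
y = np.sign(X @ w_star)              # labels defined by a separating direction

w = np.zeros(2)
lr = 0.1
for _ in range(2000):
    margins = y * (X @ w)            # per-sample margins y_i <x_i, w>
    # gradient of the mean logistic loss: -(1/n) sum y_i x_i / (1 + e^{m_i})
    grad = -(X * (y / (1.0 + np.exp(margins)))[:, None]).mean(axis=0)
    w -= lr * grad

accuracy = np.mean(np.sign(X @ w) == y)
```

On separable data the loss has no finite minimizer, so `w` grows in norm while its direction stabilizes; the paper studies this directional behavior under spherically symmetric data distributions.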

Meta-Regularization: An Approach to Adaptive Choice of the Learning Rate in Gradient Descent

no code implementations · 12 Apr 2021 · Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang

We propose Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order gradient descent methods.

On the Landscape of Sparse Linear Networks

no code implementations · 1 Jan 2021 · Dachao Lin, Ruoyu Sun, Zhihua Zhang

Network pruning, or sparsifying networks, has a long history and practical significance in modern applications.

Network Pruning

On the Landscape of One-hidden-layer Sparse Networks and Beyond

no code implementations · 16 Sep 2020 · Dachao Lin, Ruoyu Sun, Zhihua Zhang

We show that sparse linear networks can have spurious strict minima, in sharp contrast to dense linear networks, which have no spurious minima at all.

Network Pruning

Optimal Quantization for Batch Normalization in Neural Network Deployments and Beyond

no code implementations · 30 Aug 2020 · Dachao Lin, Peiqin Sun, Guangzeng Xie, Shuchang Zhou, Zhihua Zhang

Quantized Neural Networks (QNNs) represent weight parameters and activations with low bit-width fixed-point numbers, and are widely used in real-world applications because they save computational resources and yield reproducible results.
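As background for this abstract, here is a minimal sketch of the generic affine (scale and zero-point) quantization scheme used in QNN deployments; the function names are our own, and the paper's optimal quantization for batch normalization is not reproduced here.

```python
import numpy as np

def affine_quantize(x, num_bits=8):
    """Map floats to unsigned fixed-point via scale and zero-point (sketch)."""
    qmin, qmax = 0, 2 ** num_bits - 1
    scale = (x.max() - x.min()) / (qmax - qmin)
    zero_point = int(round(qmin - x.min() / scale))
    q = np.clip(np.round(x / scale) + zero_point, qmin, qmax).astype(np.uint8)
    return q, scale, zero_point

def affine_dequantize(q, scale, zero_point):
    """Recover approximate floats from quantized values."""
    return scale * (q.astype(np.float32) - zero_point)

x = np.linspace(-1.0, 1.0, 5, dtype=np.float32)
q, s, z = affine_quantize(x)
x_hat = affine_dequantize(q, s, z)
```

The reconstruction error of each entry is at most one quantization step `s`; choosing how such transformations interact with batch normalization at deployment time is the question the paper addresses.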

Affine Transformation Quantization

Towards Understanding the Importance of Noise in Training Neural Networks

no code implementations · 7 Sep 2019 · Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, Tuo Zhao

Ample empirical evidence has corroborated that noise plays a crucial role in the effective and efficient training of neural networks.

Towards Better Generalization: BP-SVRG in Training Deep Neural Networks

no code implementations · 18 Aug 2019 · Hao Jin, Dachao Lin, Zhihua Zhang

Stochastic variance-reduced gradient (SVRG) is a classical optimization method.
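The classical SVRG method referenced here can be sketched on a toy least-squares problem (this is the vanilla algorithm, not the paper's BP-SVRG variant; the problem instance and step size are our own):

```python
import numpy as np

# Vanilla SVRG on f(w) = (1/2n) ||A w - b||^2.
rng = np.random.default_rng(0)
A = rng.normal(size=(50, 5))
b = rng.normal(size=50)
n = len(b)

def grad_i(w, i):
    """Gradient of the i-th component loss (1/2)(a_i^T w - b_i)^2."""
    return (A[i] @ w - b[i]) * A[i]

w = np.zeros(5)
lr, epochs = 0.01, 30
for _ in range(epochs):
    w_snap = w.copy()
    full_grad = A.T @ (A @ w_snap - b) / n   # full gradient at the snapshot
    for _ in range(n):
        i = rng.integers(n)
        # variance-reduced stochastic gradient: unbiased, with shrinking variance
        w = w - lr * (grad_i(w, i) - grad_i(w_snap, i) + full_grad)

w_ls, *_ = np.linalg.lstsq(A, b, rcond=None)  # exact least-squares solution
```

The correction term `grad_i(w_snap, i) - full_grad` has zero mean, so each step is unbiased, while its variance vanishes as `w` approaches the snapshot, giving linear convergence on strongly convex problems.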

Hyper-Regularization: An Adaptive Choice for the Learning Rate in Gradient Descent

no code implementations · ICLR 2019 · Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang

Specifically, we impose a regularization term on the learning rate via a generalized distance, and cast the joint update of the parameter and the learning rate as a max-min problem.
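To make the idea of jointly updating parameters and a learning rate concrete, here is an illustrative sketch using a simple hypergradient-style rule as a stand-in; the paper's generalized-distance regularization and max-min formulation are not reproduced, and all names and constants below are our own.

```python
import numpy as np

def grad(w):
    """Gradient of the toy objective f(w) = 0.5 * ||w||^2."""
    return w

w = np.array([5.0, -3.0])
lr, beta = 0.01, 0.001
g_prev = grad(w)
for _ in range(200):
    g = grad(w)
    # Increase lr while successive gradients are aligned, decrease otherwise;
    # the floor keeps the learning rate positive.
    lr = max(lr + beta * (g @ g_prev), 1e-4)
    w = w - lr * g
    g_prev = g
```

The learning rate grows while progress is consistent and shrinks after overshooting, which is the general behavior an adaptive learning-rate scheme aims for.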
