no code implementations • 17 May 2022 • Dachao Lin, Zhihua Zhang
In this short note, we give a convergence analysis of the policy iterates in the recently popular policy mirror descent (PMD) method.
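For context, the standard instance of PMD with a KL (entropy) mirror map updates a tabular policy multiplicatively in the current action values. The sketch below shows that KL instance with a fixed step size eta; it is illustrative only and not necessarily the exact setting analyzed in the note.

```python
import numpy as np

def pmd_kl_step(pi, Q, eta):
    """One policy mirror descent step under the KL mirror map:
    pi_{k+1}(a|s) is proportional to pi_k(a|s) * exp(eta * Q^{pi_k}(s, a)).
    pi and Q are (num_states, num_actions) arrays."""
    logits = np.log(pi) + eta * Q                     # work in log space for stability
    logits -= logits.max(axis=1, keepdims=True)
    new_pi = np.exp(logits)
    return new_pi / new_pi.sum(axis=1, keepdims=True)
```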
no code implementations • 8 Jan 2022 • Kun Chen, Dachao Lin, Zhihua Zhang
In this paper, we follow Eftekhari's work to give a non-local convergence analysis of deep linear networks.
no code implementations • NeurIPS 2021 • Dachao Lin, Ruoyu Sun, Zhihua Zhang
In this paper, we study gradient methods for training deep linear neural networks with binary cross-entropy loss.
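A minimal sketch of that setting follows, assuming plain gradient descent, labels in {-1, +1} (the logistic form of binary cross-entropy), and illustrative layer widths and step size; none of these specific choices are taken from the paper.

```python
import numpy as np

def train_deep_linear_bce(X, y, widths=(4, 8, 8, 1), lr=0.1, steps=500, rng=None):
    """Gradient descent on a deep linear network f(x) = W_L ... W_1 x
    with the logistic / binary cross-entropy loss, labels y in {-1, +1}."""
    rng = np.random.default_rng(rng)
    dims = [X.shape[1], *widths]
    Ws = [0.1 * rng.standard_normal((dims[i + 1], dims[i])) for i in range(len(dims) - 1)]
    for _ in range(steps):
        # forward pass, caching intermediate activations
        acts = [X.T]
        for W in Ws:
            acts.append(W @ acts[-1])
        logits = acts[-1].ravel()
        # dL/dlogit for BCE with +/-1 labels: -y * sigmoid(-y * logit), averaged over samples
        delta = (-y / (1.0 + np.exp(y * logits)))[None, :] / len(y)
        # backward pass through the purely linear layers
        for i in reversed(range(len(Ws))):
            grad_W = delta @ acts[i].T
            delta = Ws[i].T @ delta
            Ws[i] -= lr * grad_W
    return Ws
```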
no code implementations • NeurIPS 2021 • Dachao Lin, Haishan Ye, Zhihua Zhang
In this paper, we follow Rodomanov and Nesterov’s work to study quasi-Newton methods.
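As a reference point, Rodomanov and Nesterov's analysis concerns quasi-Newton schemes in the Broyden family; the snippet below shows the standard BFGS update of the inverse-Hessian approximation purely as background, not the specific variant studied in this paper.

```python
import numpy as np

def bfgs_inverse_update(H, s, y):
    """Classical BFGS update of the inverse-Hessian approximation H,
    with s = x_{k+1} - x_k and y = grad f(x_{k+1}) - grad f(x_k):
    H_{k+1} = (I - rho*s*y^T) H (I - rho*y*s^T) + rho*s*s^T, rho = 1/(y^T s)."""
    rho = 1.0 / (y @ s)
    I = np.eye(len(s))
    V = I - rho * np.outer(s, y)
    return V @ H @ V.T + rho * np.outer(s, s)
```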
no code implementations • 9 May 2021 • Dachao Lin, Zhihua Zhang
We consider the fundamental problem of learning linear predictors (i.e., separable datasets with zero margin) using neural networks with gradient flow or gradient descent.
no code implementations • 12 Apr 2021 • Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang
We propose Meta-Regularization, a novel approach for the adaptive choice of the learning rate in first-order gradient descent methods.
no code implementations • 1 Jan 2021 • Dachao Lin, Ruoyu Sun, Zhihua Zhang
Network pruning, or the use of sparse networks, has a long history and practical significance in modern applications.
no code implementations • 16 Sep 2020 • Dachao Lin, Ruoyu Sun, Zhihua Zhang
We show that linear networks can have no spurious valleys under special sparse structures, and that non-linear networks can also admit no spurious valleys when the final layer is wide.
no code implementations • 30 Aug 2020 • Dachao Lin, Peiqin Sun, Guangzeng Xie, Shuchang Zhou, Zhihua Zhang
Quantized Neural Networks (QNNs) use low bit-width fixed-point numbers to represent weight parameters and activations, and are often used in real-world applications because they save computational resources and yield reproducible results.
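As a concrete reference point, a common way to obtain low bit-width fixed-point values is uniform (affine) quantization; the sketch below is generic and is not the particular quantization scheme studied in this paper.

```python
import numpy as np

def uniform_quantize(x, num_bits=8):
    """Uniform (affine) quantization of a tensor to num_bits fixed-point levels,
    a common building block of quantized neural networks (illustrative only)."""
    qmax = 2 ** num_bits - 1
    lo, hi = x.min(), x.max()
    scale = (hi - lo) / qmax if hi > lo else 1.0
    q = np.clip(np.round((x - lo) / scale), 0, qmax)   # integer codes in [0, qmax]
    return q * scale + lo                              # de-quantized (fake-quantized) values
```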
no code implementations • 7 Sep 2019 • Mo Zhou, Tianyi Liu, Yan Li, Dachao Lin, Enlu Zhou, Tuo Zhao
Ample empirical evidence has corroborated that noise plays a crucial role in the effective and efficient training of neural networks.
no code implementations • 18 Aug 2019 • Hao Jin, Dachao Lin, Zhihua Zhang
Stochastic variance-reduced gradient (SVRG) is a classical optimization method.
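For reference, a minimal version of the SVRG loop looks like the following; the step size, epoch length, and the use of the last inner iterate are illustrative choices rather than those made in the paper.

```python
import numpy as np

def svrg(grad_i, x0, n, step, epochs=10, inner_steps=None, rng=None):
    """Minimal SVRG for f(x) = (1/n) * sum_i f_i(x).
    grad_i(x, i) must return the gradient of the i-th component f_i at x."""
    rng = np.random.default_rng(rng)
    inner_steps = inner_steps or n
    x = x0.copy()
    for _ in range(epochs):
        snapshot = x.copy()
        full_grad = np.mean([grad_i(snapshot, i) for i in range(n)], axis=0)
        for _ in range(inner_steps):
            i = rng.integers(n)
            # variance-reduced stochastic gradient
            v = grad_i(x, i) - grad_i(snapshot, i) + full_grad
            x = x - step * v
    return x
```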
no code implementations • ICLR 2019 • Guangzeng Xie, Hao Jin, Dachao Lin, Zhihua Zhang
Specifically, we impose a regularization term on the learning rate via a generalized distance, and cast the joint update of the parameters and the learning rate as a max-min problem.
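Read schematically, such a max-min formulation could take a form like the one below, where D is a generalized distance between learning rates, eta_t the previous learning rate, and lambda a hypothetical regularization weight; this is only an illustrative reading of the abstract, not the paper's actual objective.

```latex
(\theta_{t+1},\, \eta_{t+1}) \in \arg\max_{\eta > 0}\ \min_{\theta}\;
\Big\{ \langle \nabla f(\theta_t),\, \theta - \theta_t \rangle
      + \tfrac{1}{2\eta}\,\lVert \theta - \theta_t \rVert^2
      - \lambda\, D(\eta,\, \eta_t) \Big\}
```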