Search Results for author: Fangshuo Liao

Found 6 papers, 0 papers with code

On the Error-Propagation of Inexact Deflation for Principal Component Analysis

no code implementations • 6 Oct 2023 • Fangshuo Liao, Junhyung Lyle Kim, Cruz Barnum, Anastasios Kyrillidis

Principal Component Analysis (PCA) is a popular tool in data analysis, especially when the data is high-dimensional.
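
For context, a minimal sketch of deflation-based PCA in which each leading eigenvector is computed only approximately (a finite number of power iterations) before being deflated away; the name `inexact_deflation_pca` and the `power_iters` parameter are illustrative, and this is not the paper's exact algorithm or error analysis:

```python
import numpy as np

def inexact_deflation_pca(A, k, power_iters=50):
    """Sketch of deflation-based PCA: extract the top-k eigenvectors of a
    symmetric PSD matrix A one at a time, deflating after each step.
    The number of power iterations controls how 'inexact' each step is."""
    A = A.copy()
    d = A.shape[0]
    components = []
    for _ in range(k):
        v = np.random.randn(d)
        v /= np.linalg.norm(v)
        for _ in range(power_iters):          # approximate top eigenvector
            v = A @ v
            v /= np.linalg.norm(v)
        components.append(v)
        A = A - (v @ A @ v) * np.outer(v, v)  # Hotelling deflation of the found direction
    return np.stack(components)

# Usage: top-3 principal directions of a random covariance matrix
X = np.random.randn(200, 10)
cov = X.T @ X / X.shape[0]
V = inexact_deflation_pca(cov, k=3)
```

Because later components are extracted from a matrix deflated with only approximate directions, any error in the early components propagates downstream, which is the effect the paper's title refers to.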

Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks

no code implementations • 13 Jun 2023 • Fangshuo Liao, Anastasios Kyrillidis

Current state-of-the-art analyses of the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity (a minimal sketch of the momentum update appears after this entry).

Open-Ended Question Answering
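
As a rough illustration of the optimizer being analyzed, here is a minimal sketch of Nesterov's momentum (a look-ahead gradient plus a velocity buffer) training the hidden layer of a shallow ReLU network on synthetic data; the hyperparameters, widths, and the fixed output layer are assumptions made for the sketch, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 64, 5, 128                           # samples, input dim, hidden width
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # trainable hidden layer
a = rng.choice([-1.0, 1.0], m) / np.sqrt(m)    # fixed output layer (illustrative)
V = np.zeros_like(W)                           # momentum buffer
lr, beta = 1e-2, 0.9

for step in range(500):
    W_look = W + beta * V                      # look-ahead point
    H = np.maximum(X @ W_look, 0.0)            # ReLU features
    resid = H @ a - y
    grad = X.T @ (np.outer(resid, a) * (H > 0)) / n  # gradient at look-ahead point
    V = beta * V - lr * grad                   # Nesterov momentum update
    W = W + V
```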

Strong Lottery Ticket Hypothesis with $\varepsilon$--perturbation

no code implementations • 29 Oct 2022 • Zheyang Xiong, Fangshuo Liao, Anastasios Kyrillidis

The strong Lottery Ticket Hypothesis (LTH) claims the existence of a subnetwork in a sufficiently large, randomly initialized neural network that approximates some target neural network without any training.
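
A toy illustration of the flavor of this statement: approximate each weight of a small target layer by masking in a few entries from a much larger pool of random weights, the subset-sum viewpoint behind several strong-LTH constructions. The routine `greedy_subset_sum` and all sizes below are hypothetical and not taken from this paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def greedy_subset_sum(pool, target, budget=8):
    """Greedily pick distinct pool entries whose running sum approaches `target`."""
    remaining = pool.copy()
    total = 0.0
    for _ in range(budget):
        i = int(np.argmin(np.abs(total + remaining - target)))
        total += remaining[i]
        remaining = np.delete(remaining, i)   # each random weight is used at most once
    return total

target_layer = rng.uniform(-1, 1, size=10)    # weights to approximate
random_pool = rng.uniform(-1, 1, size=200)    # weights of the larger random network
approx = np.array([greedy_subset_sum(random_pool, w) for w in target_layer])
print(np.max(np.abs(approx - target_layer)))  # small approximation error, no training
```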

LOFT: Finding Lottery Tickets through Filter-wise Training

no code implementations • 28 Oct 2022 • Qihan Wang, Chen Dun, Fangshuo Liao, Chris Jermaine, Anastasios Kyrillidis

LoFT is a model-parallel pretraining algorithm that partitions convolutional layers by filters to train them independently in a distributed setting, resulting in reduced memory and communication costs during pretraining.
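
A minimal sketch of the filter-wise partitioning idea: split a convolutional layer's weight tensor along the output-filter axis, update each shard independently, then merge the shards back. The names `partition_filters`, `local_update` (a stand-in for real local training), and `merge_filters` are illustrative, not the LoFT implementation:

```python
import numpy as np

def partition_filters(weight, num_workers):
    # weight has shape (out_channels, in_channels, kH, kW); split by output filters
    return np.array_split(weight, num_workers, axis=0)

def local_update(shard, lr=1e-2):
    # Stand-in for an independent local training step on one worker.
    fake_grad = np.random.randn(*shard.shape)
    return shard - lr * fake_grad

def merge_filters(shards):
    return np.concatenate(shards, axis=0)     # reassemble the full layer

conv_weight = np.random.randn(64, 3, 3, 3)    # 64 filters of size 3x3x3
shards = partition_filters(conv_weight, num_workers=4)
shards = [local_update(s) for s in shards]    # each worker trains only its filters
conv_weight = merge_filters(shards)
```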

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons

no code implementations • 5 Dec 2021 • Fangshuo Liao, Anastasios Kyrillidis

Motivated by the goal of training all the parameters of a neural network, we study why and when this can be achieved by iteratively creating, training, and combining randomly selected subnetworks.
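
A minimal sketch of such a loop, assuming a shallow ReLU network: in each round a random subset of hidden neurons is selected, only that subnetwork takes gradient steps, and the updates fold back into the shared full model. The mask rate, step counts, and widths below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 64, 5, 64
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # shared hidden layer
a = rng.choice([-1.0, 1.0], m) / np.sqrt(m)    # fixed output layer (illustrative)
lr, keep_prob = 1e-2, 0.5

for round_ in range(20):
    mask = rng.random(m) < keep_prob           # randomly selected neurons
    for _ in range(10):                        # train only the masked subnetwork
        H = np.maximum(X @ W, 0.0) * mask
        resid = H @ a - y
        grad = X.T @ (np.outer(resid, a) * ((X @ W) > 0) * mask) / n
        W -= lr * grad                         # masked-out columns receive zero gradient
```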

How much pre-training is enough to discover a good subnetwork?

no code implementations • 31 Jul 2021 • Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, Junhyung Lyle Kim, Anastasios Kyrillidis

Aiming to mathematically analyze the amount of dense network pre-training needed for a pruned network to perform well, we derive a simple theoretical bound on the number of gradient descent pre-training iterations on a two-layer, fully-connected network, beyond which pruning via greedy forward selection [61] yields a subnetwork that achieves good training error (a sketch of greedy forward selection appears after this entry).

Network Pruning
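
A sketch of greedy forward selection in this spirit, assuming a pre-trained two-layer ReLU network: starting from an empty subnetwork, repeatedly add the hidden neuron whose inclusion (with a least-squares refit of the output weights) best fits the training targets. The refit rule and all sizes are illustrative rather than the exact procedure of [61]:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m, k = 100, 5, 256, 20                   # samples, input dim, width, neurons to keep
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # stand-in for a pre-trained hidden layer
H = np.maximum(X @ W, 0.0)                     # hidden-layer features

selected = []
for _ in range(k):
    best_j, best_err = None, np.inf
    for j in range(m):
        if j in selected:
            continue
        cols = H[:, selected + [j]]            # candidate subnetwork's features
        coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
        err = np.linalg.norm(cols @ coef - y)
        if err < best_err:
            best_j, best_err = j, err
    selected.append(best_j)                    # greedily add the best neuron

print("kept neurons:", selected)
```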
