Search Results for author: Fangshuo Liao

Found 6 papers, 0 papers with code

On the Error-Propagation of Inexact Deflation for Principal Component Analysis

no code implementations • 6 Oct 2023 • Fangshuo Liao, Junhyung Lyle Kim, Cruz Barnum, Anastasios Kyrillidis

Principal Component Analysis (PCA) is a popular tool in data analysis, especially when the data is high-dimensional.
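
For context, a minimal sketch of deflation-based PCA in which each leading eigenvector is computed only approximately (a finite number of power iterations) before being deflated away; the name `inexact_deflation_pca` and the `power_iters` parameter are illustrative, and this is not the paper's exact algorithm or error analysis:

```python
import numpy as np

def inexact_deflation_pca(A, k, power_iters=50):
    """Sketch of deflation-based PCA: extract the top-k eigenvectors of a
    symmetric PSD matrix A one at a time, deflating after each step.
    The number of power iterations controls how 'inexact' each step is."""
    A = A.copy()
    d = A.shape[0]
    components = []
    for _ in range(k):
        v = np.random.randn(d)
        v /= np.linalg.norm(v)
        for _ in range(power_iters):          # approximate top eigenvector
            v = A @ v
            v /= np.linalg.norm(v)
        components.append(v)
        A = A - (v @ A @ v) * np.outer(v, v)  # Hotelling deflation of the found direction
    return np.stack(components)

# Usage: top-3 principal directions of a random covariance matrix
X = np.random.randn(200, 10)
cov = X.T @ X / X.shape[0]
V = inexact_deflation_pca(cov, k=3)
```

Because later components are extracted from a matrix deflated with only approximate directions, any error in the early components propagates downstream, which is the effect the paper's title refers to.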

Provable Accelerated Convergence of Nesterov's Momentum for Deep ReLU Neural Networks

no code implementations • 13 Jun 2023 • Fangshuo Liao, Anastasios Kyrillidis

Current state-of-the-art analyses of the convergence of gradient descent for training neural networks focus on characterizing properties of the loss landscape, such as the Polyak-Łojasiewicz (PL) condition and restricted strong convexity (a minimal sketch of the momentum update appears after this entry).

Open-Ended Question Answering
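
As a rough illustration of the optimizer being analyzed, here is a minimal sketch of Nesterov's momentum (a look-ahead gradient plus a velocity buffer) training the hidden layer of a shallow ReLU network on synthetic data; the hyperparameters, widths, and the fixed output layer are assumptions made for the sketch, not the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 64, 5, 128                           # samples, input dim, hidden width
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # trainable hidden layer
a = rng.choice([-1.0, 1.0], m) / np.sqrt(m)    # fixed output layer (illustrative)
V = np.zeros_like(W)                           # momentum buffer
lr, beta = 1e-2, 0.9

for step in range(500):
    W_look = W + beta * V                      # look-ahead point
    H = np.maximum(X @ W_look, 0.0)            # ReLU features
    resid = H @ a - y
    grad = X.T @ (np.outer(resid, a) * (H > 0)) / n  # gradient at look-ahead point
    V = beta * V - lr * grad                   # Nesterov momentum update
    W = W + V
```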

Strong Lottery Ticket Hypothesis with $\varepsilon$--perturbation

no code implementations • 29 Oct 2022 • Zheyang Xiong, Fangshuo Liao, Anastasios Kyrillidis

The strong Lottery Ticket Hypothesis (LTH) claims the existence of a subnetwork in a sufficiently large, randomly initialized neural network that approximates some target neural network without any training.
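
A toy illustration of the flavor of this statement: approximate each weight of a small target layer by masking in a few entries from a much larger pool of random weights, the subset-sum viewpoint behind several strong-LTH constructions. The routine `greedy_subset_sum` and all sizes below are hypothetical and not taken from this paper:

```python
import numpy as np

rng = np.random.default_rng(1)

def greedy_subset_sum(pool, target, budget=8):
    """Greedily pick distinct pool entries whose running sum approaches `target`."""
    remaining = pool.copy()
    total = 0.0
    for _ in range(budget):
        i = int(np.argmin(np.abs(total + remaining - target)))
        total += remaining[i]
        remaining = np.delete(remaining, i)   # each random weight is used at most once
    return total

target_layer = rng.uniform(-1, 1, size=10)    # weights to approximate
random_pool = rng.uniform(-1, 1, size=200)    # weights of the larger random network
approx = np.array([greedy_subset_sum(random_pool, w) for w in target_layer])
print(np.max(np.abs(approx - target_layer)))  # small approximation error, no training
```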

LOFT: Finding Lottery Tickets through Filter-wise Training

no code implementations • 28 Oct 2022 • Qihan Wang, Chen Dun, Fangshuo Liao, Chris Jermaine, Anastasios Kyrillidis

LoFT is a model-parallel pretraining algorithm that partitions convolutional layers by filters to train them independently in a distributed setting, resulting in reduced memory and communication costs during pretraining.
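
A minimal sketch of the filter-wise partitioning idea: split a convolutional layer's weight tensor along the output-filter axis, update each shard independently, then merge the shards back. The names `partition_filters`, `local_update` (a stand-in for real local training), and `merge_filters` are illustrative, not the LoFT implementation:

```python
import numpy as np

def partition_filters(weight, num_workers):
    # weight has shape (out_channels, in_channels, kH, kW); split by output filters
    return np.array_split(weight, num_workers, axis=0)

def local_update(shard, lr=1e-2):
    # Stand-in for an independent local training step on one worker.
    fake_grad = np.random.randn(*shard.shape)
    return shard - lr * fake_grad

def merge_filters(shards):
    return np.concatenate(shards, axis=0)     # reassemble the full layer

conv_weight = np.random.randn(64, 3, 3, 3)    # 64 filters of size 3x3x3
shards = partition_filters(conv_weight, num_workers=4)
shards = [local_update(s) for s in shards]    # each worker trains only its filters
conv_weight = merge_filters(shards)
```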

On the Convergence of Shallow Neural Network Training with Randomly Masked Neurons

no code implementations • 5 Dec 2021 • Fangshuo Liao, Anastasios Kyrillidis

Motivated by the goal of training all the parameters of a neural network, we study why and when this can be achieved by iteratively creating, training, and combining randomly selected subnetworks.
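
A minimal sketch of such a loop, assuming a shallow ReLU network: in each round a random subset of hidden neurons is selected, only that subnetwork takes gradient steps, and the updates fold back into the shared full model. The mask rate, step counts, and widths below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 64, 5, 64
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # shared hidden layer
a = rng.choice([-1.0, 1.0], m) / np.sqrt(m)    # fixed output layer (illustrative)
lr, keep_prob = 1e-2, 0.5

for round_ in range(20):
    mask = rng.random(m) < keep_prob           # randomly selected neurons
    for _ in range(10):                        # train only the masked subnetwork
        H = np.maximum(X @ W, 0.0) * mask
        resid = H @ a - y
        grad = X.T @ (np.outer(resid, a) * ((X @ W) > 0) * mask) / n
        W -= lr * grad                         # masked-out columns receive zero gradient
```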

How much pre-training is enough to discover a good subnetwork?

no code implementations • 31 Jul 2021 • Cameron R. Wolfe, Fangshuo Liao, Qihan Wang, Junhyung Lyle Kim, Anastasios Kyrillidis

Aiming to mathematically analyze the amount of dense network pre-training needed for a pruned network to perform well, we derive a simple theoretical bound on the number of gradient descent pre-training iterations on a two-layer, fully-connected network, beyond which pruning via greedy forward selection [61] yields a subnetwork that achieves good training error (a sketch of greedy forward selection appears after this entry).

Network Pruning
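
A sketch of greedy forward selection in this spirit, assuming a pre-trained two-layer ReLU network: starting from an empty subnetwork, repeatedly add the hidden neuron whose inclusion (with a least-squares refit of the output weights) best fits the training targets. The refit rule and all sizes are illustrative rather than the exact procedure of [61]:

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m, k = 100, 5, 256, 20                   # samples, input dim, width, neurons to keep
X, y = rng.standard_normal((n, d)), rng.standard_normal(n)
W = rng.standard_normal((d, m)) / np.sqrt(d)   # stand-in for a pre-trained hidden layer
H = np.maximum(X @ W, 0.0)                     # hidden-layer features

selected = []
for _ in range(k):
    best_j, best_err = None, np.inf
    for j in range(m):
        if j in selected:
            continue
        cols = H[:, selected + [j]]            # candidate subnetwork's features
        coef, *_ = np.linalg.lstsq(cols, y, rcond=None)
        err = np.linalg.norm(cols @ coef - y)
        if err < best_err:
            best_j, best_err = j, err
    selected.append(best_j)                    # greedily add the best neuron

print("kept neurons:", selected)
```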
