Search Results for author: Janice Lan

Found 7 papers, 5 papers with code

Uncovering the impact of learning rate for global magnitude pruning

no code implementations • 1 Jan 2021 • Janice Lan, Rudy Chin, Alexei Baevski, Ari S. Morcos

However, prior work has implicitly assumed that the best training configuration for model performance was also the best configuration for mask discovery.
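For context, here is a minimal sketch of global magnitude pruning, the technique the paper studies: magnitudes from all layers are pooled, a single global threshold is chosen, and everything below it is masked out. The NumPy setup, sparsity level, and function name are illustrative assumptions, not from the paper.

```python
import numpy as np

def global_magnitude_prune(weights, sparsity=0.9):
    """Zero out the smallest-magnitude weights across ALL layers at once.

    weights:  list of np.ndarray, one per layer
    sparsity: fraction of weights to remove globally
    Returns one binary mask per layer.
    """
    # Pool every magnitude into one vector so a single global threshold
    # is used -- this is what distinguishes global from per-layer pruning.
    all_mags = np.concatenate([np.abs(w).ravel() for w in weights])
    threshold = np.quantile(all_mags, sparsity)
    return [(np.abs(w) > threshold).astype(w.dtype) for w in weights]

# Toy usage: prune two layers to 75% global sparsity.
layers = [np.random.randn(4, 4), np.random.randn(8)]
masks = global_magnitude_prune(layers, sparsity=0.75)
pruned = [w * m for w, m in zip(layers, masks)]
```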

First-Order Preconditioning via Hypergradient Descent

1 code implementation • 18 Oct 2019 • Ted Moskovitz, Rui Wang, Janice Lan, Sanyam Kapoor, Thomas Miconi, Jason Yosinski, Aditya Rawal

Standard gradient descent methods are susceptible to a range of issues that can impede training, such as high correlations and different scaling in parameter space. These difficulties can be addressed by second-order approaches that apply a pre-conditioning matrix to the gradient to improve convergence.
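As a rough illustration of the idea, the sketch below applies a learned preconditioning matrix P to the gradient and adapts P by hypergradient descent, i.e. by differentiating the loss after one step with respect to P. The toy quadratic objective, step sizes, and update rule are assumptions for illustration, not the paper's exact algorithm.

```python
import numpy as np

# Ill-conditioned toy quadratic: loss(theta) = 0.5 * theta^T A theta.
A = np.diag([100.0, 1.0])            # very different curvature per axis
loss = lambda th: 0.5 * th @ A @ th
grad = lambda th: A @ th

theta = np.array([1.0, 1.0])
P = np.eye(2)                        # preconditioner, adapted online
lr, hyper_lr = 0.005, 1e-3

prev_g = grad(theta)
for _ in range(300):
    # Preconditioned step: transform the gradient by P before applying it.
    theta = theta - lr * P @ prev_g
    g = grad(theta)
    # Hypergradient: dL/dP = -lr * outer(g_new, g_old) by the chain rule
    # through the last update, so descent on P adds back the outer product.
    P += hyper_lr * lr * np.outer(g, prev_g)
    prev_g = g

print("final loss:", loss(theta))
```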

LCA: Loss Change Allocation for Neural Network Training

2 code implementations • NeurIPS 2019 • Janice Lan, Rosanne Liu, Hattie Zhou, Jason Yosinski

We propose a new window into training called Loss Change Allocation (LCA), in which credit for changes to the network loss is conservatively partitioned to the parameters.
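A minimal sketch of the first-order version of that idea, assuming plain gradient descent on a toy quadratic: the per-step loss change is approximated by the dot product of the gradient with the parameter update, which splits exactly into one term per parameter. The paper itself integrates more accurately and works on real networks; everything below is an illustrative assumption.

```python
import numpy as np

A = np.diag([3.0, 1.0])
loss = lambda th: 0.5 * th @ A @ th
grad = lambda th: A @ th

theta = np.array([1.0, 1.0])
start_loss = loss(theta)
lr = 0.1
lca = np.zeros_like(theta)           # per-parameter credit for loss change

for _ in range(50):
    g = grad(theta)
    new_theta = theta - lr * g
    # First-order LCA: delta_loss ~= grad . delta_theta, a sum whose
    # i-th term is the credit (or blame) assigned to parameter i.
    lca += g * (new_theta - theta)
    theta = new_theta

# The per-parameter allocations sum to roughly the total loss change
# (exactly so in the limit of small steps).
print("allocated:", lca.sum(), "actual:", loss(theta) - start_loss)
```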

Deconstructing Lottery Tickets: Zeros, Signs, and the Supermask

6 code implementations • NeurIPS 2019 • Hattie Zhou, Janice Lan, Rosanne Liu, Jason Yosinski

The recent "Lottery Ticket Hypothesis" paper by Frankle & Carbin showed that a simple approach to creating sparse networks (keeping the large weights) results in models that are trainable from scratch, but only when starting from the same initial weights.
