1 code implementation • ICLR 2022 • Brett W. Larsen, Stanislav Fort, Nic Becker, Surya Ganguli
In particular, we show via Gordon's escape theorem that the training dimension plus the Gaussian width of the desired loss sub-level set, projected onto a unit sphere surrounding the initialization, must exceed the total number of parameters for the success probability to be large.
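Stated compactly (a sketch, assuming the normalization in which the squared Gaussian width $w(S)^2$ is commensurate with dimension; $d$ is the training dimension, $D$ the total number of parameters, and $S$ the sub-level set projected onto the sphere):

$$ d + w(S)^2 \gtrsim D. $$

Conversely, Gordon's escape theorem implies that when $d + w(S)^2 \ll D$, a random $d$-dimensional affine training subspace misses $S$ with high probability, so training restricted to that subspace cannot reach the desired loss.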
1 code implementation • 2 Jun 2022 • Mansheej Paul, Brett W. Larsen, Surya Ganguli, Jonathan Frankle, Gintare Karolina Dziugaite
A striking observation about iterative magnitude pruning (IMP; Frankle et al. 2020) is that, after just a few hundred steps of dense training, the method can find a sparse sub-network that can be trained to the same accuracy as the dense network.
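For concreteness, here is a minimal sketch of IMP with weight rewinding under stated assumptions; the `train` function, `PRUNE_FRACTION`, and `REWIND_STEPS` are illustrative placeholders, not the authors' implementation:

```python
# Minimal sketch of iterative magnitude pruning (IMP) with weight rewinding,
# in the spirit of Frankle et al. 2020. All names here are illustrative.
import numpy as np

rng = np.random.default_rng(0)

def train(weights, mask, steps):
    """Placeholder for SGD training; a dummy update on unmasked weights."""
    for _ in range(steps):
        grad = rng.normal(size=weights.shape)     # stand-in for a real gradient
        weights = (weights - 0.01 * grad) * mask  # pruned weights stay at zero
    return weights

D = 1000              # number of parameters
PRUNE_FRACTION = 0.2  # fraction of surviving weights removed per round
REWIND_STEPS = 500    # "a few hundred steps" of dense training

w_init = rng.normal(size=D)
mask = np.ones(D)

# Dense training for a few hundred steps; these weights are the rewind point.
w_rewind = train(w_init.copy(), mask, REWIND_STEPS)

w = w_rewind.copy()
for round_ in range(5):
    w = train(w, mask, steps=5000)  # train to completion
    # Prune the smallest-magnitude surviving weights.
    alive = np.flatnonzero(mask)
    k = int(PRUNE_FRACTION * alive.size)
    pruned = alive[np.argsort(np.abs(w[alive]))[:k]]
    mask[pruned] = 0.0
    # Rewind surviving weights to their values at step REWIND_STEPS.
    w = w_rewind * mask
    print(f"round {round_}: sparsity = {1 - mask.mean():.2f}")
```

The rewind step is the key design choice: rather than resetting to initialization, surviving weights are reset to their values after the brief dense-training phase, which is what makes the sparse sub-network trainable to full accuracy.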
1 code implementation • 9 Oct 2023 • Dean A. Pospisil, Brett W. Larsen, Sarah E. Harvey, Alex H. Williams
Measuring geometric similarity between high-dimensional network representations is a topic of longstanding interest in both neuroscience and deep learning.
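As one concrete instance of such a measure, linear centered kernel alignment (CKA) compares two response matrices; the sketch below is a standard baseline for illustration, not the specific estimator developed in this paper:

```python
# Linear CKA between two sets of network representations; a common
# similarity baseline, shown here purely for illustration.
import numpy as np

def linear_cka(X, Y):
    """X, Y: (n_stimuli, n_neurons) response matrices; columns are centered."""
    X = X - X.mean(axis=0)
    Y = Y - Y.mean(axis=0)
    # ||X^T Y||_F^2 normalized by ||X^T X||_F * ||Y^T Y||_F
    num = np.linalg.norm(X.T @ Y, "fro") ** 2
    den = np.linalg.norm(X.T @ X, "fro") * np.linalg.norm(Y.T @ Y, "fro")
    return num / den

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 50))
Q, _ = np.linalg.qr(rng.normal(size=(50, 50)))   # random orthogonal map
Y = X @ Q + 0.1 * rng.normal(size=(200, 50))
# Linear CKA is invariant to orthogonal transformations, so this is near 1.
print(linear_cka(X, Y))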
1 code implementation • 31 Dec 2019 • Abbas Kazemipour, Brett W. Larsen, Shaul Druckmann
Despite the practical success of neural networks, a theoretical understanding of their loss landscapes has proven challenging due to the high-dimensional, non-convex, and highly nonlinear structure of such models.
no code implementations • 6 Oct 2022 • Mansheej Paul, Feng Chen, Brett W. Larsen, Jonathan Frankle, Surya Ganguli, Gintare Karolina Dziugaite
Third, we show how the flatness of the error landscape at the end of training determines a limit on the fraction of weights that can be pruned at each iteration of IMP.
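The underlying logic admits a simple quadratic-approximation sketch (illustrative, not the paper's exact bound): treating pruning as a perturbation $\delta w$ that zeroes the smallest-magnitude weights, a locally quadratic error landscape with sharpness $\lambda_{\max}$ (top Hessian eigenvalue) stays within a tolerance $\varepsilon$ of the minimum only if

$$ \Delta L \approx \tfrac{1}{2}\, \lambda_{\max}\, \|\delta w\|^2 \le \varepsilon \quad\Longrightarrow\quad \|\delta w\|^2 \le \frac{2\varepsilon}{\lambda_{\max}}, $$

so flatter minima (smaller $\lambda_{\max}$) tolerate larger pruning perturbations, and hence a larger fraction of weights can be removed per iteration.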
no code implementations • 19 Nov 2023 • Sarah E. Harvey, Brett W. Larsen, Alex H. Williams
A multitude of (dis)similarity measures between neural network representations have been proposed, resulting in a fragmented research landscape.