1 code implementation • NeurIPS 2023 • Feng Chen, Daniel Kunin, Atsushi Yamamura, Surya Ganguli
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of independent parameters and improving generalization.
no code implementations • 7 Oct 2022 • Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli
We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while structured enough to enable geometric analysis of its gradient dynamics.
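For orientation, a hedged sketch of the contrast (the notation here is illustrative, chosen to match the standard notion of homogeneity): a homogeneous model satisfies $f(\alpha \theta) = \alpha^{L} f(\theta)$ for all $\alpha > 0$, a property that breaks as soon as biases or normalization parameters appear, whereas a quasi-homogeneous model only requires $f(\alpha^{a_1} \theta_1, \dots, \alpha^{a_d} \theta_d) = \alpha^{c} f(\theta)$, so that each parameter group may carry its own scaling exponent and networks with biases, residual connections, and normalization layers can still be assigned consistent exponents.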
no code implementations • 24 Apr 2022 • Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying
Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly exhibits several separate scales.
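For intuition, a one-dimensional toy (an illustration, not the paper's construction): a purely quadratic basin grows as $L(\theta) - L(\theta^*) \propto \|\theta - \theta^*\|^{2}$, whereas subquadratic growth means $L(\theta) - L(\theta^*) \propto \|\theta - \theta^*\|^{p}$ with $p < 2$ along some directions, the kind of behavior that arises when curvatures of many different magnitudes are mixed within a single neighborhood of the minimum.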
no code implementations • NeurIPS 2021 • Hidenori Tanaka, Daniel Kunin
In nature, symmetry governs regularities, while symmetry breaking brings texture.
no code implementations • 29 Sep 2021 • Daniel Kunin, Javier Sagastuy-Brena, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel LK Yamins
In this work we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
1 code implementation • 19 Jul 2021 • Daniel Kunin, Javier Sagastuy-Brena, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
In this work we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
no code implementations • 6 May 2021 • Hidenori Tanaka, Daniel Kunin
In nature, symmetry governs regularities, while symmetry breaking brings texture.
no code implementations • ICLR 2021 • Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel LK Yamins, Hidenori Tanaka
Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.
1 code implementation • 8 Dec 2020 • Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka
Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.
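For one concrete instance of such a symmetry, a minimal sketch in standard PyTorch (written for illustration, not taken from the authors' code): weights feeding into a batch-normalization layer are scale-invariant, so their gradient is orthogonal to them, which is exactly the kind of geometric constraint that makes quantities like the squared weight norm analytically tractable during training.

# Scale symmetry from BatchNorm: rescaling the incoming weights leaves the loss
# unchanged, so by Euler's theorem <w, dL/dw> = 0 (up to BatchNorm's eps).
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(10, 32, bias=False),  # weights feeding into BatchNorm are scale-invariant
    nn.BatchNorm1d(32),
    nn.ReLU(),
    nn.Linear(32, 1),
)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(net(x), y)
loss.backward()

w = net[0].weight
print(torch.dot(w.flatten(), w.grad.flatten()).item())  # numerically zero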
5 code implementations • NeurIPS 2020 • Hidenori Tanaka, Daniel Kunin, Daniel L. K. Yamins, Surya Ganguli
Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time.
1 code implementation • ICML 2020 • Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan M. Bloom, Daniel L. K. Yamins
The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport: the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another.
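To make the requirement concrete, a small numerical sketch (illustrative only; feedback alignment is a well-known alternative from this literature and is used here purely as a contrast, not as the routes proposed in the paper): backpropagation computes the hidden-layer error with the transpose of the forward weights, which is precisely the weight transport in question, while feedback alignment substitutes a fixed random matrix.

import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(0, 0.1, (20, 10)), rng.normal(0, 0.1, (1, 20))
B = rng.normal(0, 0.1, (1, 20))            # fixed random feedback weights, never updated
x, y = rng.normal(size=(10, 32)), rng.normal(size=(1, 32))

h = np.maximum(W1 @ x, 0)                  # forward pass through a ReLU hidden layer
e = W2 @ h - y                             # output error for a squared loss

delta_bp = (W2.T @ e) * (h > 0)            # backprop: must transport W2 to the backward pass
delta_fa = (B.T @ e) * (h > 0)             # feedback alignment: random B replaces W2^T

dW1_bp = delta_bp @ x.T / x.shape[1]       # both rules give hidden-layer updates of the
dW1_fa = delta_fa @ x.T / x.shape[1]       # same form; only the feedback matrix differs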
2 code implementations • 23 Jan 2019 • Daniel Kunin, Jonathan M. Bloom, Aleksandrina Goeva, Cotton Seed
Autoencoders are deep learning models for representation learning.
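As a minimal sketch of the object being studied (an illustrative NumPy setup, not the paper's code): a linear autoencoder compresses data $X$ through an encoder $W_1$ and a decoder $W_2$ and minimizes the reconstruction error, here with an added $L_2$ penalty as in the regularized setting the paper analyzes.

import numpy as np

rng = np.random.default_rng(0)
# synthetic data with 5-dimensional latent structure plus a little noise
X = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 200)) / np.sqrt(50)
X += 0.01 * rng.normal(size=(50, 200))
n, k, lam, lr = X.shape[1], 5, 1e-3, 0.05
W1 = rng.normal(0, 0.1, (k, 50))           # encoder
W2 = rng.normal(0, 0.1, (50, k))           # decoder

for _ in range(2000):                      # plain gradient descent on the regularized loss
    R = W2 @ W1 @ X - X                    # reconstruction residual
    gW2 = 2 * R @ (W1 @ X).T / n + 2 * lam * W2
    gW1 = 2 * W2.T @ R @ X.T / n + 2 * lam * W1
    W1, W2 = W1 - lr * gW1, W2 - lr * gW2

print(np.linalg.norm(W2 @ W1 @ X - X) / np.linalg.norm(X))  # relative reconstruction error

The loss landscape analyzed in the paper is that of this kind of regularized objective; the sketch above is only meant to make the setup concrete.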