1 code implementation • NeurIPS 2023 • Feng Chen, Daniel Kunin, Atsushi Yamamura, Surya Ganguli
In this work, we reveal a strong implicit bias of stochastic gradient descent (SGD) that drives overly expressive networks to much simpler subnetworks, thereby dramatically reducing the number of independent parameters and improving generalization.
no code implementations • 7 Oct 2022 • Daniel Kunin, Atsushi Yamamura, Chao Ma, Surya Ganguli
We introduce the class of quasi-homogeneous models, which is expressive enough to describe nearly all neural networks with homogeneous activations, even those with biases, residual connections, and normalization layers, while structured enough to enable geometric analysis of its gradient dynamics.
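For orientation, a hedged sketch of the contrast (the notation here is illustrative, chosen to match the standard notion of homogeneity): a homogeneous model satisfies $f(\alpha \theta) = \alpha^{L} f(\theta)$ for all $\alpha > 0$, a property that breaks as soon as biases or normalization parameters appear, whereas a quasi-homogeneous model only requires $f(\alpha^{a_1} \theta_1, \dots, \alpha^{a_d} \theta_d) = \alpha^{c} f(\theta)$, so that each parameter group may carry its own scaling exponent and networks with biases, residual connections, and normalization layers can still be assigned consistent exponents.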
no code implementations • 24 Apr 2022 • Chao Ma, Daniel Kunin, Lei Wu, Lexing Ying
Numerically, we observe that neural network loss functions possess a multiscale structure, manifested in two ways: (1) in a neighborhood of minima, the loss mixes a continuum of scales and grows subquadratically, and (2) in a larger region, the loss clearly exhibits several separate scales.
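For intuition, a one-dimensional toy (an illustration, not the paper's construction): a purely quadratic basin grows as $L(\theta) - L(\theta^*) \propto \|\theta - \theta^*\|^{2}$, whereas subquadratic growth means $L(\theta) - L(\theta^*) \propto \|\theta - \theta^*\|^{p}$ with $p < 2$ along some directions, the kind of behavior that arises when curvatures of many different magnitudes are mixed within a single neighborhood of the minimum.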
no code implementations • NeurIPS 2021 • Hidenori Tanaka, Daniel Kunin
In nature, symmetry governs regularities, while symmetry breaking brings texture.
no code implementations • 29 Sep 2021 • Daniel Kunin, Javier Sagastuy-Brena, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel LK Yamins
In this work we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
1 code implementation • 19 Jul 2021 • Daniel Kunin, Javier Sagastuy-Brena, Lauren Gillespie, Eshed Margalit, Hidenori Tanaka, Surya Ganguli, Daniel L. K. Yamins
In this work we explore the limiting dynamics of deep neural networks trained with stochastic gradient descent (SGD).
no code implementations • 6 May 2021 • Hidenori Tanaka, Daniel Kunin
In nature, symmetry governs regularities, while symmetry breaking brings texture.
no code implementations • ICLR 2021 • Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel LK Yamins, Hidenori Tanaka
Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.
1 code implementation • 8 Dec 2020 • Daniel Kunin, Javier Sagastuy-Brena, Surya Ganguli, Daniel L. K. Yamins, Hidenori Tanaka
Overall, by exploiting symmetry, our work demonstrates that we can analytically describe the learning dynamics of various parameter combinations at finite learning rates and batch sizes for state-of-the-art architectures trained on any dataset.
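For one concrete instance of such a symmetry, a minimal sketch in standard PyTorch (written for illustration, not taken from the authors' code): weights feeding into a batch-normalization layer are scale-invariant, so their gradient is orthogonal to them, which is exactly the kind of geometric constraint that makes quantities like the squared weight norm analytically tractable during training.

# Scale symmetry from BatchNorm: rescaling the incoming weights leaves the loss
# unchanged, so by Euler's theorem <w, dL/dw> = 0 (up to BatchNorm's eps).
import torch
import torch.nn as nn

torch.manual_seed(0)
net = nn.Sequential(
    nn.Linear(10, 32, bias=False),  # weights feeding into BatchNorm are scale-invariant
    nn.BatchNorm1d(32),
    nn.ReLU(),
    nn.Linear(32, 1),
)
x, y = torch.randn(64, 10), torch.randn(64, 1)
loss = nn.functional.mse_loss(net(x), y)
loss.backward()

w = net[0].weight
print(torch.dot(w.flatten(), w.grad.flatten()).item())  # numerically zero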
5 code implementations • NeurIPS 2020 • Hidenori Tanaka, Daniel Kunin, Daniel L. K. Yamins, Surya Ganguli
Pruning the parameters of deep neural networks has generated intense interest due to potential savings in time, memory and energy both during training and at test time.
1 code implementation • ICML 2020 • Daniel Kunin, Aran Nayebi, Javier Sagastuy-Brena, Surya Ganguli, Jonathan M. Bloom, Daniel L. K. Yamins
The neural plausibility of backpropagation has long been disputed, primarily for its use of non-local weight transport: the biologically dubious requirement that one neuron instantaneously measure the synaptic weights of another.
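To make the requirement concrete, a small numerical sketch (illustrative only; feedback alignment is a well-known alternative from this literature and is used here purely as a contrast, not as the routes proposed in the paper): backpropagation computes the hidden-layer error with the transpose of the forward weights, which is precisely the weight transport in question, while feedback alignment substitutes a fixed random matrix.

import numpy as np

rng = np.random.default_rng(0)
W1, W2 = rng.normal(0, 0.1, (20, 10)), rng.normal(0, 0.1, (1, 20))
B = rng.normal(0, 0.1, (1, 20))            # fixed random feedback weights, never updated
x, y = rng.normal(size=(10, 32)), rng.normal(size=(1, 32))

h = np.maximum(W1 @ x, 0)                  # forward pass through a ReLU hidden layer
e = W2 @ h - y                             # output error for a squared loss

delta_bp = (W2.T @ e) * (h > 0)            # backprop: must transport W2 to the backward pass
delta_fa = (B.T @ e) * (h > 0)             # feedback alignment: random B replaces W2^T

dW1_bp = delta_bp @ x.T / x.shape[1]       # both rules give hidden-layer updates of the
dW1_fa = delta_fa @ x.T / x.shape[1]       # same form; only the feedback matrix differs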
2 code implementations • 23 Jan 2019 • Daniel Kunin, Jonathan M. Bloom, Aleksandrina Goeva, Cotton Seed
Autoencoders are deep learning models for representation learning.
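As a minimal sketch of the object being studied (an illustrative NumPy setup, not the paper's code): a linear autoencoder compresses data $X$ through an encoder $W_1$ and a decoder $W_2$ and minimizes the reconstruction error, here with an added $L_2$ penalty as in the regularized setting the paper analyzes.

import numpy as np

rng = np.random.default_rng(0)
# synthetic data with 5-dimensional latent structure plus a little noise
X = rng.normal(size=(50, 5)) @ rng.normal(size=(5, 200)) / np.sqrt(50)
X += 0.01 * rng.normal(size=(50, 200))
n, k, lam, lr = X.shape[1], 5, 1e-3, 0.05
W1 = rng.normal(0, 0.1, (k, 50))           # encoder
W2 = rng.normal(0, 0.1, (50, k))           # decoder

for _ in range(2000):                      # plain gradient descent on the regularized loss
    R = W2 @ W1 @ X - X                    # reconstruction residual
    gW2 = 2 * R @ (W1 @ X).T / n + 2 * lam * W2
    gW1 = 2 * W2.T @ R @ X.T / n + 2 * lam * W1
    W1, W2 = W1 - lr * gW1, W2 - lr * gW2

print(np.linalg.norm(W2 @ W1 @ X - X) / np.linalg.norm(X))  # relative reconstruction error

The loss landscape analyzed in the paper is that of this kind of regularized objective; the sketch above is only meant to make the setup concrete.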