Search Results for author: Ethan Dyer

Found 11 papers, 3 papers with code

Explaining Neural Scaling Laws

1 code implementation • 12 Feb 2021 • Yasaman Bahri, Ethan Dyer, Jared Kaplan, Jaehoon Lee, Utkarsh Sharma

In the large width limit, this can be equivalently obtained from the spectrum of certain kernels, and we present evidence that large width and large dataset resolution-limited scaling exponents are related by a duality.
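The resolution-limited scaling this entry refers to is, at its simplest, a power law relating loss to dataset size. As a hedged illustration only (the dataset sizes, losses, and fitted exponent below are invented, not taken from the paper), such an exponent can be estimated by a linear fit in log-log space:

```python
import numpy as np

# Hypothetical (dataset size, test loss) pairs; in practice these would come
# from training runs at several dataset sizes.
sizes = np.array([1e3, 3e3, 1e4, 3e4, 1e5])
losses = np.array([0.52, 0.31, 0.19, 0.115, 0.07])

# Fit L(D) ~= C * D**(-alpha) by linear regression in log-log space.
slope, intercept = np.polyfit(np.log(sizes), np.log(losses), 1)
alpha, C = -slope, np.exp(intercept)
print(f"fitted exponent alpha ~= {alpha:.2f}, prefactor C ~= {C:.2f}")
```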

Tradeoffs in Data Augmentation: An Empirical Study

no code implementations • ICLR 2021 • Raphael Gontijo-Lopes, Sylvia Smullin, Ekin Dogus Cubuk, Ethan Dyer

Though data augmentation has become a standard component of deep neural network training, the underlying mechanism behind the effectiveness of these techniques remains poorly understood.

Data Augmentation
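For context on what is being augmented: the techniques in question are standard image transforms of the kind sketched below. This is a minimal, self-contained numpy stand-in for a horizontal flip and a padded random crop, not code from the paper (the same caveat applies to the related Affinity and Diversity entry further down).

```python
import numpy as np

def random_flip_and_crop(img, pad=4):
    """Toy horizontal flip + padded random crop, two commonly used augmentations."""
    h, w, _ = img.shape
    if np.random.rand() < 0.5:                    # random horizontal flip
        img = img[:, ::-1, :]
    padded = np.pad(img, ((pad, pad), (pad, pad), (0, 0)), mode="reflect")
    top = np.random.randint(0, 2 * pad + 1)       # random crop back to h x w
    left = np.random.randint(0, 2 * pad + 1)
    return padded[top:top + h, left:left + w, :]

augmented = random_flip_and_crop(np.random.rand(32, 32, 3))
print(augmented.shape)  # (32, 32, 3)
```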

The large learning rate phase of deep learning

1 code implementation • 1 Jan 2021 • Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Sohl-Dickstein, Guy Gur-Ari

In the small learning rate phase, training can be understood using the existing theory of infinitely wide neural networks.
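To make the two phases concrete, here is a hedged toy experiment (not the paper's setup): full-batch gradient descent on a small one-hidden-layer network at a small and a large learning rate. In this toy the large-learning-rate run may simply become unstable; the paper's observation is that real networks in that phase can still converge, via dynamics the small-learning-rate theory does not capture.

```python
import numpy as np

def train(lr, steps=200, width=256, seed=0):
    """Full-batch GD on a tiny 1-hidden-layer regression net; returns the loss curve."""
    rng = np.random.default_rng(seed)
    x = rng.normal(size=(64, 8)); y = rng.normal(size=(64, 1))
    W1 = rng.normal(size=(8, width)) / np.sqrt(8)
    W2 = rng.normal(size=(width, 1)) / np.sqrt(width)
    losses = []
    for _ in range(steps):
        h = np.tanh(x @ W1)
        err = h @ W2 - y
        losses.append(float(np.mean(err ** 2)))
        g_pred = 2 * err / len(x)                     # d(mean sq. error)/d(pred)
        gW2 = h.T @ g_pred
        gW1 = x.T @ ((g_pred @ W2.T) * (1 - h ** 2))  # backprop through tanh
        W1 -= lr * gW1; W2 -= lr * gW2
    return losses

small, large = train(lr=0.05), train(lr=2.0)   # the lr values are arbitrary
print("final loss, small lr:", small[-1], "| large lr:", large[-1])
```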

When Do Curricula Work?

1 code implementation • ICLR 2021 • Xiaoxia Wu, Ethan Dyer, Behnam Neyshabur

Inspired by common use cases of curriculum learning in practice, we investigate the role of limited training time budget and noisy data in the success of curriculum learning.

Curriculum Learning
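A minimal sketch of the curriculum setup the entry describes, assuming an easy-to-hard ordering driven by a per-example difficulty score under a fixed step budget; the linear pacing function and the random difficulty scores are placeholders, not the paper's choices.

```python
import numpy as np

def curriculum_batches(difficulty, n_steps, batch_size, rng):
    """Yield index batches that start with the easiest examples and gradually
    open up the full (harder) dataset over a fixed budget of n_steps."""
    order = np.argsort(difficulty)                 # easiest first
    n = len(order)
    for step in range(n_steps):
        frac = min(1.0, 0.2 + 0.8 * step / max(1, n_steps - 1))  # linear pacing
        pool = order[: max(batch_size, int(frac * n))]
        yield rng.choice(pool, size=batch_size, replace=False)

rng = np.random.default_rng(0)
difficulty = rng.random(1000)        # e.g. per-example loss from a proxy model
for batch in curriculum_batches(difficulty, n_steps=5, batch_size=8, rng=rng):
    print(batch)
```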

Asymptotics of Wide Convolutional Neural Networks

no code implementations • 19 Aug 2020 • Anders Andreassen, Ethan Dyer

Wide neural networks have proven to be a rich class of architectures for both theory and practice.

Anatomy of Catastrophic Forgetting: Hidden Representations and Task Semantics

no code implementations • ICLR 2021 • Vinay V. Ramasesh, Ethan Dyer, Maithra Raghu

A central challenge in developing versatile machine learning systems is catastrophic forgetting: a model trained on tasks in sequence will suffer significant performance drops on earlier tasks.
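The phenomenon is easy to reproduce in a toy setting. Below is a hedged sketch (synthetic data, plain logistic regression, nothing from the paper): train on task A, then continue training on task B only, and watch accuracy on task A fall.

```python
import numpy as np

def accuracy(w, X, y):
    return float(np.mean((X @ w > 0) == y))

def train_logreg(w, X, y, lr=0.1, epochs=50):
    """Plain full-batch logistic regression; stands in for 'a model' in the demo."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-(X @ w)))
        w -= lr * X.T @ (p - y) / len(y)
    return w

rng = np.random.default_rng(0)
# Two synthetic binary tasks with different decision boundaries.
Xa = rng.normal(size=(500, 20)); ya = (Xa[:, 0] + Xa[:, 1] > 0).astype(float)
Xb = rng.normal(size=(500, 20)); yb = (Xb[:, 0] - Xb[:, 1] > 0).astype(float)

w = train_logreg(np.zeros(20), Xa, ya)          # task A first
acc_before = accuracy(w, Xa, ya)
w = train_logreg(w, Xb, yb)                     # then task B only
print(f"task A accuracy: {acc_before:.2f} -> {accuracy(w, Xa, ya):.2f} after task B")
```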

The large learning rate phase of deep learning: the catapult mechanism

no code implementations • 4 Mar 2020 • Aitor Lewkowycz, Yasaman Bahri, Ethan Dyer, Jascha Sohl-Dickstein, Guy Gur-Ari

In the small learning rate phase, training can be understood using the existing theory of infinitely wide neural networks.

Affinity and Diversity: Quantifying Mechanisms of Data Augmentation

no code implementations • 20 Feb 2020 • Raphael Gontijo-Lopes, Sylvia J. Smullin, Ekin D. Cubuk, Ethan Dyer

Though data augmentation has become a standard component of deep neural network training, the underlying mechanism behind the effectiveness of these techniques remains poorly understood.

Data Augmentation

Asymptotics of Wide Networks from Feynman Diagrams

no code implementations • ICLR 2020 • Ethan Dyer, Guy Gur-Ari

Understanding the asymptotic behavior of wide networks is of considerable interest.

Gradient Descent Happens in a Tiny Subspace

no code implementations • ICLR 2019 • Guy Gur-Ari, Daniel A. Roberts, Ethan Dyer

We show that in a variety of large-scale deep learning scenarios the gradient dynamically converges to a very small subspace after a short period of training.

General Classification
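One rough way to probe this claim numerically is sketched below: record full-batch gradients during training, take the top few principal directions of the early gradients, and measure how much of a late gradient lies in their span. Note that the paper characterizes the subspace via the top Hessian eigenvectors; the gradient-PCA stand-in, the toy logistic-regression task, and the choice k = 5 are all assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 30))
y = (X @ rng.normal(size=30) > 0).astype(float)   # toy 30-parameter task

def grad(w):
    p = 1 / (1 + np.exp(-(X @ w)))
    return X.T @ (p - y) / len(y)

w, lr, grads = np.zeros(30), 0.5, []
for _ in range(300):
    g = grad(w)
    grads.append(g)
    w -= lr * g

# Top-k principal directions of the gradients seen early in training.
_, _, Vt = np.linalg.svd(np.stack(grads[:50]), full_matrices=False)
top_k = Vt[:5]

g_late = grads[-1]
frac = np.linalg.norm(top_k @ g_late) ** 2 / np.linalg.norm(g_late) ** 2
print(f"fraction of the late gradient's squared norm in the top-5 subspace: {frac:.2f}")
```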
