Gradient Descent on Neural Networks Typically Occurs at the Edge of Stability
1 code implementation • ICLR 2021 • Jeremy M. Cohen, Simran Kaur, Yuanzhi Li, J. Zico Kolter, Ameet Talwalkar
We empirically demonstrate that full-batch gradient descent on neural network training objectives typically operates in a regime we call the Edge of Stability: the maximum eigenvalue of the training loss Hessian hovers just above the value 2/(step size), and the training loss behaves non-monotonically over short timescales yet decreases consistently over long timescales.
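To make the claim concrete, here is a minimal, hypothetical sketch of how one could watch for this regime: power iteration on Hessian-vector products estimates the sharpness (top Hessian eigenvalue) during a full-batch gradient descent run, which can then be compared against the 2/(step size) threshold. The model, data, and hyperparameters below are illustrative assumptions, not the paper's setup or its released code.

```python
import torch
import torch.nn as nn

def sharpness(loss, params, iters=50):
    """Estimate the largest-magnitude Hessian eigenvalue of `loss`
    w.r.t. `params` by power iteration on Hessian-vector products."""
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum((vi ** 2).sum() for vi in v))
    v = [vi / norm for vi in v]
    for _ in range(iters):
        # Hv = d<grad, v>/dparams, computed without forming H explicitly.
        gv = sum((g * vi).sum() for g, vi in zip(grads, v))
        hv = torch.autograd.grad(gv, params, retain_graph=True)
        norm = torch.sqrt(sum((h ** 2).sum() for h in hv))
        v = [h / norm for h in hv]
    return norm.item()  # ||Hv|| with unit v -> top eigenvalue at convergence

# Illustrative toy full-batch run (our choices, not the paper's experiments).
torch.manual_seed(0)
X, y = torch.randn(256, 10), torch.randn(256, 1)
model = nn.Sequential(nn.Linear(10, 32), nn.Tanh(), nn.Linear(32, 1))
lr = 0.05
for step in range(201):
    loss = nn.functional.mse_loss(model(X), y)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    with torch.no_grad():
        for p, g in zip(model.parameters(), grads):
            p -= lr * g  # plain full-batch gradient descent step
    if step % 50 == 0:
        lam = sharpness(nn.functional.mse_loss(model(X), y),
                        list(model.parameters()))
        print(f"step {step:3d}  loss {loss.item():.4f}  "
              f"sharpness {lam:5.1f}  2/lr {2 / lr:.1f}")
```

At the Edge of Stability one would expect the printed sharpness to rise toward 2/lr and then hover just above it while the loss keeps falling non-monotonically; whether this particular toy run exhibits the effect depends on the architecture and step size.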
Adaptive Gradient Methods at the Edge of Stability
no code implementations • 29 Jul 2022 • Jeremy M. Cohen, Behrooz Ghorbani, Shankar Krishnan, Naman Agarwal, Sourabh Medapati, Michal Badura, Daniel Suo, David Cardoze, Zachary Nado, George E. Dahl, Justin Gilmer
Very little is known about the training dynamics of adaptive gradient methods like Adam in deep learning. We empirically demonstrate that during full-batch training, the maximum eigenvalue of the preconditioned Hessian typically equilibrates at the stability threshold of a related non-adaptive algorithm; for Adam with step size η and β1 = 0.9, this threshold is 38/η.
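A rough sketch of how one might monitor the analogous quantity for Adam, under stated assumptions: the preconditioner P = diag(sqrt(v̂) + ε) is rebuilt from torch.optim.Adam's exp_avg_sq state, and power iteration runs on the symmetric form P^{-1/2} H P^{-1/2}. The bias correction and ε placement here are our guesses, not the paper's measurement protocol.

```python
import torch

def preconditioned_sharpness(loss, params, opt, iters=50, eps=1e-8):
    """Power-iteration estimate of the top eigenvalue of
    P^{-1/2} H P^{-1/2}, with P rebuilt from Adam's second-moment
    state (call only after at least one opt.step())."""
    beta2 = opt.defaults["betas"][1]
    pinv_sqrt = []  # per-parameter diagonal of P^{-1/2}
    for p in params:
        state = opt.state[p]
        vhat = state["exp_avg_sq"] / (1 - beta2 ** float(state["step"]))
        pinv_sqrt.append((vhat.sqrt() + eps).rsqrt())
    grads = torch.autograd.grad(loss, params, create_graph=True)
    v = [torch.randn_like(p) for p in params]
    norm = torch.sqrt(sum((vi ** 2).sum() for vi in v))
    v = [vi / norm for vi in v]
    for _ in range(iters):
        # Apply P^{-1/2}, then H (Hessian-vector product), then P^{-1/2}.
        w = [s * vi for s, vi in zip(pinv_sqrt, v)]
        gw = sum((g * wi).sum() for g, wi in zip(grads, w))
        hw = torch.autograd.grad(gw, params, retain_graph=True)
        mv = [s * h for s, h in zip(pinv_sqrt, hw)]
        norm = torch.sqrt(sum((m ** 2).sum() for m in mv))
        v = [m / norm for m in mv]
    return norm.item()

# Hypothetical usage during a full-batch Adam run (beta1 = 0.9):
#   loss = loss_fn(model(X), y)
#   lam = preconditioned_sharpness(loss, list(model.parameters()), opt)
#   print(lam, "vs threshold 38/lr =", 38 / opt.defaults["lr"])
```

The similarity transform is used because P^{-1}H is not symmetric; P^{-1/2} H P^{-1/2} has the same eigenvalues, so plain power iteration applies.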