25 Oct 2023 • Noam Levi, Alon Beck, Yohai Bar-Sinai
Grokking is the intriguing phenomenon where a model learns to generalize long after it has fit the training data.
Memorization