no code implementations • 22 Nov 2021 • Anna Kerekes, Anna Mészáros, Ferenc Huszár
In gradient descent, changing how we parametrize the model can lead to drastically different optimization trajectories, giving rise to a surprising range of meaningful inductive biases: identifying sparse classifiers or reconstructing low-rank matrices without explicit regularization.
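The effect described above can be illustrated with a small, hypothetical experiment (not code from the paper): on an underdetermined least-squares problem, plain gradient descent on `w` finds a dense minimum-norm solution, while the same algorithm on the "Hadamard" reparametrization `w = u * u`, started near zero, recovers a sparse solution with no explicit regularizer. All names and hyperparameters here are assumptions for illustration.

```python
import numpy as np

# Hypothetical illustration of implicit bias from reparametrization.
rng = np.random.default_rng(0)
n, d = 20, 50                            # fewer equations than unknowns
X = rng.standard_normal((n, d))
w_true = np.zeros(d)
w_true[:3] = 1.0                         # sparse, nonnegative ground truth
y = X @ w_true

def grad(w):
    return X.T @ (X @ w - y) / n         # gradient of 0.5/n * ||Xw - y||^2

# Direct parametrization: converges to the dense minimum-L2-norm interpolant.
w_direct = np.zeros(d)
for _ in range(20000):
    w_direct -= 0.05 * grad(w_direct)

# Hadamard parametrization w = u * u; the small init drives the sparse bias.
u = np.full(d, 1e-3)
for _ in range(20000):
    u -= 0.05 * 2 * u * grad(u * u)      # chain rule through w = u * u
w_hadamard = u * u

# The reparametrized run fits y with far fewer active coordinates.
print(np.sum(np.abs(w_direct) > 1e-2), np.sum(w_hadamard > 1e-2))
```

Both runs interpolate the data exactly; only the parametrization, and hence the optimization trajectory, differs.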
no code implementations • 19 Oct 2022 • Szilvia Ujváry, Zsigmond Telek, Anna Kerekes, Anna Mészáros, Ferenc Huszár
Sharpness-aware minimization (SAM) aims to improve the generalisation of gradient-based learning by seeking out flat minima.
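The SAM update itself is simple to sketch: ascend to an approximate worst-case point within a small L2 ball around the current weights, then descend using the gradient evaluated there. Below is a minimal, hypothetical version on a toy quadratic; the names `rho` and `lr` and the toy loss are assumptions, not taken from the paper above.

```python
import numpy as np

def sam_step(w, grad_fn, lr=0.1, rho=0.05):
    """One step of sharpness-aware minimization (sketch).

    First perturb to the boundary of an L2 ball of radius rho in the
    gradient direction, then apply the gradient from that point."""
    g = grad_fn(w)
    eps = rho * g / (np.linalg.norm(g) + 1e-12)  # ascent step
    g_sam = grad_fn(w + eps)                     # gradient at perturbed point
    return w - lr * g_sam

# Toy loss f(w) = 0.5 * ||w||^2, whose gradient is simply w.
w = np.array([1.0, -2.0])
for _ in range(100):
    w = sam_step(w, lambda w: w)
print(np.linalg.norm(w))  # shrinks toward the minimum at 0
```

In practice `grad_fn` would be a stochastic minibatch gradient, and the perturbation is recomputed on the same batch.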
no code implementations • 16 May 2023 • Francisco Vargas, Teodora Reu, Anna Kerekes
Denoising diffusion models are a class of generative models that have recently achieved state-of-the-art results across many domains.
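As a minimal, hypothetical sketch of how such models sample (DDPM-style ancestral sampling, not the method of the paper above): data is noised forward through a variance schedule, and samples are drawn by iteratively denoising from a Gaussian prior. Choosing the data distribution to be N(0, 1) makes the optimal noise predictor available in closed form, `eps_hat(x_t, t) = sqrt(1 - alpha_bar_t) * x_t`, so no network training is needed for the illustration; the schedule below is an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
T = 100
betas = np.linspace(1e-4, 0.2, T)        # assumed variance schedule
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)

def eps_hat(x, t):
    # Exact E[eps | x_t] when the data law is N(0, 1).
    return np.sqrt(1.0 - alpha_bars[t]) * x

n = 20000
x = rng.standard_normal(n)               # start from the N(0, 1) prior
for t in range(T - 1, -1, -1):           # reverse (denoising) chain
    coef = betas[t] / np.sqrt(1.0 - alpha_bars[t])
    mean = (x - coef * eps_hat(x, t)) / np.sqrt(alphas[t])
    noise = rng.standard_normal(n) if t > 0 else 0.0
    x = mean + np.sqrt(betas[t]) * noise

print(x.mean(), x.std())  # close to 0 and 1: samples match the data law
```

With a learned network in place of `eps_hat`, the same loop generates samples from an arbitrary data distribution.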
no code implementations • 3 May 2024 • Patrik Reizinger, Szilvia Ujváry, Anna Mészáros, Anna Kerekes, Wieland Brendel, Ferenc Huszár
The last decade has seen blossoming research in deep learning theory attempting to answer, "Why does deep learning generalize?"