no code implementations • 31 Jan 2023 • Elisabetta Cornacchia, Elchanan Mossel
Curriculum learning (CL), i.e. training using samples that are generated and presented in a meaningful order, was introduced in the machine learning context around a decade ago.
1 code implementation • 26 May 2022 • Emmanuel Abbe, Samy Bengio, Elisabetta Cornacchia, Jon Kleinberg, Aryo Lotfi, Maithra Raghu, Chiyuan Zhang
More generally, the paper considers the learning of logical functions with gradient descent (GD) on neural networks.
1 code implementation • 22 Mar 2022 • Elisabetta Cornacchia, Francesca Mignacco, Rodrigo Veiga, Cédric Gerbelot, Bruno Loureiro, Lenka Zdeborová
For Gaussian teacher weights, we investigate the performance of ERM with both cross-entropy and square losses, and explore the role of ridge regularisation in approaching Bayes-optimality.
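The setup above can be sketched minimally: a teacher with Gaussian weights generates sign labels, and a student is fit by empirical risk minimization under the two losses with ridge regularization. This is an illustrative sketch, not the paper's exact model; the dimensions, the regularization strength `lam`, and the use of noiseless sign labels are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical teacher-student setup (sizes are illustrative):
# Gaussian teacher weights, labels in {-1, +1}.
n, d = 400, 50
w_star = rng.normal(size=d)
X = rng.normal(size=(n, d)) / np.sqrt(d)
y = np.sign(X @ w_star)

lam = 0.1  # ridge strength (assumed value)

# ERM with the square loss + ridge penalty: closed-form ridge regression.
w_sq = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ y)

# ERM with the logistic (cross-entropy) loss + ridge penalty:
# plain gradient descent on the regularized empirical risk.
w_ce = np.zeros(d)
lr = 0.5
for _ in range(500):
    margins = y * (X @ w_ce)
    # d/dw of mean log(1 + exp(-y x.w)) plus the ridge term
    grad = -(X.T @ (y / (1.0 + np.exp(margins)))) / n + (lam / n) * w_ce
    w_ce -= lr * grad

for name, w in [("square", w_sq), ("cross-entropy", w_ce)]:
    acc = np.mean(np.sign(X @ w) == y)
    print(f"{name} loss: train accuracy {acc:.2f}")
```

Both estimators recover a direction correlated with the teacher; the paper's question of how ridge regularization approaches Bayes-optimality concerns the high-dimensional limit, which this toy run only gestures at.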
no code implementations • 25 Feb 2022 • Emmanuel Abbe, Elisabetta Cornacchia, Jan Hązła, Christopher Marquis
This paper introduces the notion of "Initial Alignment" (INAL) between a neural network at initialization and a target function.
no code implementations • 3 Nov 2021 • Elisabetta Cornacchia, Jan Hązła, Ido Nachum, Amir Yehudayoff
We study the implicit bias of ReLU neural networks trained by a variant of SGD where at each step, the label is changed with probability $p$ to a random label (label smoothing being a close variant of this procedure).
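The training procedure described above can be sketched as follows: one-sample SGD on a one-hidden-layer ReLU network where, at each step, the label is replaced by a uniformly random label with probability $p$. The data model, network sizes, and squared loss are assumptions for illustration; the paper's exact setting is not reproduced here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy data: labels in {-1, +1} given by the sign of the
# first coordinate (the paper's data distribution is an assumption here).
n, d, h = 200, 5, 16
X = rng.normal(size=(n, d))
y = np.where(X[:, 0] > 0, 1.0, -1.0)

# One-hidden-layer ReLU network.
W1 = rng.normal(scale=1.0 / np.sqrt(d), size=(d, h))
w2 = rng.normal(scale=1.0 / np.sqrt(h), size=h)

p, lr = 0.1, 0.01  # label-flip probability and step size (illustrative)

for _ in range(2000):
    i = rng.integers(n)
    x, label = X[i], y[i]
    if rng.random() < p:                 # with probability p, train on a
        label = rng.choice([-1.0, 1.0])  # random label instead of the true one
    z = x @ W1
    a = np.maximum(z, 0.0)               # ReLU activations
    out = a @ w2
    g = out - label                      # squared-loss gradient at the output
    W1 -= lr * np.outer(x, (g * w2) * (z > 0))
    w2 -= lr * g * a

preds = np.sign(np.maximum(X @ W1, 0.0) @ w2)
acc = np.mean(preds == y)
print(f"train accuracy after noisy-label SGD: {acc:.2f}")
```

Setting `p = 0` recovers plain SGD, which makes it easy to compare the implicit bias of the noisy and noiseless procedures on the same data.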
no code implementations • 29 Jan 2021 • Emmanuel Abbe, Elisabetta Cornacchia, Yuzhou Gu, Yury Polyanskiy
The limit of the entropy in the stochastic block model (SBM) has been characterized in the sparse regime for the special case of disassortative communities [COKPZ17] and for the classical case of assortative communities but in the dense regime [DAM16].
Probability • Information Theory