1 code implementation • NeurIPS 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová
Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.
no code implementations • 8 Apr 2018 • Valentina Ros, Gerard Ben Arous, Giulio Biroli, Chiara Cammarota
We study rough high-dimensional landscapes in which an increasingly stronger preference for a given configuration emerges.
no code implementations • 21 Dec 2018 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.
no code implementations • ICML 2018 • Marco Baity-Jesi, Levent Sagun, Mario Geiger, Stefano Spigler, Gerard Ben Arous, Chiara Cammarota, Yann LeCun, Matthieu Wyart, Giulio Biroli
We analyze numerically the training dynamics of deep neural networks (DNN) by using methods developed in statistical physics of glassy systems.
no code implementations • 29 May 2019 • Giulio Biroli, Chiara Cammarota, Federico Ricci-Tersenghi
In many high-dimensional estimation problems the main task consists in minimizing a cost function, which is often strongly non-convex when scanned in the space of parameters to be estimated.
no code implementations • 18 Jul 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová
Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones.
no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová
Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem.
no code implementations • 23 Sep 2020 • Andrea Marcello Mambuca, Chiara Cammarota, Izaak Neri
We show that, in general, linear dynamical systems defined on random graphs with a prescribed degree distribution of unbounded support are unstable if they are large enough, implying a tradeoff between stability and diversity.
Statistical Mechanics • Disordered Systems and Neural Networks • Populations and Evolution
no code implementations • 4 Mar 2024 • Tony Bonnaire, Giulio Biroli, Chiara Cammarota
Through both theoretical analysis and numerical experiments, we show that in practical cases, i.e. for finite but even very large $N$, successful optimization via gradient descent in phase retrieval is achieved by falling towards the good minima before reaching the bad ones.
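To make the setting of the last entry concrete, here is a minimal sketch of gradient descent on a noiseless, real-valued phase retrieval problem. It is an illustration under stated assumptions, not the authors' code: the quartic loss, the Gaussian sensing vectors, the unit-norm random initialisation, and all names and parameter values (`phase_retrieval_gd`, `n`, `m`, `lr`, `steps`) are choices made here for the example.

```python
import numpy as np


def phase_retrieval_gd(n=20, m=200, lr=0.02, steps=2000, seed=0):
    """Plain gradient descent on a toy phase retrieval loss.

    Assumed setup (hypothetical, for illustration only):
      signal     w_star in R^n, unit norm
      sensing    a_i ~ N(0, I_n), i = 1..m
      labels     y_i = (a_i . w_star)^2   (the sign of a_i . w_star is lost)
      loss       L(w) = 1/(4m) * sum_i ((a_i . w)^2 - y_i)^2
    """
    rng = np.random.default_rng(seed)

    # Hidden signal and Gaussian sensing matrix.
    w_star = rng.standard_normal(n)
    w_star /= np.linalg.norm(w_star)
    A = rng.standard_normal((m, n))
    y = (A @ w_star) ** 2

    # Random unit-norm initialisation, uninformed about w_star.
    w = rng.standard_normal(n)
    w /= np.linalg.norm(w)

    losses = []
    for _ in range(steps):
        z = A @ w                      # projections a_i . w
        r = z ** 2 - y                 # residuals on the squared outputs
        losses.append(0.25 * np.mean(r ** 2))
        grad = A.T @ (r * z) / m       # gradient of the quartic loss
        w -= lr * grad

    # Normalised overlap with the signal; the absolute value accounts
    # for the global sign ambiguity w -> -w.
    overlap = abs(w @ w_star) / np.linalg.norm(w)
    return np.array(losses), overlap


if __name__ == "__main__":
    losses, overlap = phase_retrieval_gd()
    print(f"initial loss {losses[0]:.4f}, final loss {losses[-1]:.4f}")
    print(f"overlap with signal {overlap:.4f}")
```

Tracking the overlap alongside the loss is one simple way to see whether a run ends near the signal or in a poorly correlated minimum; varying `m / n` and `n` in this sketch gives a rough feel for the finite-$N$ behaviour the abstract refers to.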