Search Results for author: Stefano Sarao Mannelli

Found 18 papers, 2 papers with code

Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning

no code implementations · 28 Feb 2024 · Jin Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe

Diverse studies in systems neuroscience begin with extended periods of training known as 'shaping' procedures.

Reinforcement Learning

Optimal transfer protocol by incremental layer defrosting

no code implementations · 2 Mar 2023 · Federica Gerace, Diego Doimo, Stefano Sarao Mannelli, Luca Saglietti, Alessandro Laio

The simplest transfer learning protocol is based on "freezing" the feature-extractor layers of a network pre-trained on a data-rich source task, and then adapting only the last layers to a data-poor target task.
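For illustration, a minimal PyTorch sketch of this freezing protocol; the torchvision ResNet-18 backbone and the 10-class target head are placeholder choices, not the paper's setup:

```python
import torch
import torch.nn as nn
from torchvision import models

# Placeholder backbone pre-trained on a data-rich source task.
model = models.resnet18(weights="IMAGENET1K_V1")

# "Freeze" the feature-extractor layers: no gradients are computed for them.
for param in model.parameters():
    param.requires_grad = False

# Replace the last layer with a fresh head for the data-poor target task
# (an assumed 10-class problem); only its parameters remain trainable.
model.fc = nn.Linear(model.fc.in_features, 10)

# The optimizer only updates the trainable head.
optimizer = torch.optim.SGD(model.fc.parameters(), lr=1e-3)
```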

Transfer Learning

Bias-inducing geometries: an exactly solvable data model with fairness implications

no code implementations · 31 May 2022 · Stefano Sarao Mannelli, Federica Gerace, Negar Rostamzadeh, Luca Saglietti

We then consider a novel mitigation strategy based on a matched inference approach, which consists in introducing coupled learning models.

Fairness

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

1 code implementation · 18 May 2022 · Sebastian Lee, Stefano Sarao Mannelli, Claudia Clopath, Sebastian Goldt, Andrew Saxe

Continual learning - learning new tasks in sequence while maintaining performance on old tasks - remains particularly challenging for artificial neural networks.

Continual Learning

An Analytical Theory of Curriculum Learning in Teacher-Student Networks

no code implementations · 15 Jun 2021 · Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe

To study the former, we provide an exact description of the online learning setting, confirming the long-standing experimental observation that curricula can modestly speed up learning.

Probing transfer learning with a model of synthetic correlated datasets

no code implementations · 9 Jun 2021 · Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task.

Binary Classification · Transfer Learning

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

no code implementations · NeurIPS 2021 · Stefano Sarao Mannelli, Pierfrancesco Urbani

The optimization step in many machine learning problems rarely relies on vanilla gradient descent; instead, it is common practice to use momentum-based accelerated methods.
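For reference, the standard heavy-ball form of momentum descent, sketched in NumPy on a toy one-dimensional non-convex loss; the loss, step size, and momentum values are illustrative assumptions, not those analyzed in the paper:

```python
import numpy as np

def grad(x):
    # Gradient of a toy non-convex loss f(x) = (x^2 - 1)^2 / 4.
    return x * (x**2 - 1)

x, v = 2.0, 0.0          # initial point and velocity
eta, beta = 0.05, 0.9    # illustrative step size and momentum
for _ in range(100):
    v = beta * v - eta * grad(x)   # heavy-ball velocity update
    x = x + v
print(x)  # settles near one of the minima at x = +/- 1
```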

Numerical Integration

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

no code implementations · NeurIPS 2020 · Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

We consider a teacher-student scenario where the teacher has the same structure as the student but with a hidden layer of smaller width $m^*\le m$.
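A minimal sketch of such a teacher-student setup with quadratic activations; the input dimension, widths, and sample count below are arbitrary illustrations rather than the paper's choices:

```python
import numpy as np

rng = np.random.default_rng(0)
d, m_star, m = 20, 3, 6   # input dimension, teacher width m* <= student width m

W_teacher = rng.standard_normal((m_star, d))  # fixed teacher weights
W_student = rng.standard_normal((m, d))       # trainable student weights

def network(W, X):
    # Shallow network with quadratic activation: sum_k (w_k . x)^2
    return ((X @ W.T) ** 2).sum(axis=1)

X = rng.standard_normal((1000, d))
y = network(W_teacher, X)            # labels generated by the teacher
loss = np.mean((network(W_student, X) - y) ** 2)
```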

Post-Workshop Report on Science meets Engineering in Deep Learning, NeurIPS 2019, Vancouver

no code implementations · 25 Jun 2020 · Levent Sagun, Caglar Gulcehre, Adriana Romero, Negar Rostamzadeh, Stefano Sarao Mannelli

Science meets Engineering in Deep Learning took place in Vancouver as part of the Workshop section of NeurIPS 2019.

Winning the competition: enhancing counter-contagion in SIS-like epidemic processes

no code implementations · 24 Jun 2020 · Argyris Kalogeratos, Stefano Sarao Mannelli

In this paper we consider the epidemic competition between two generic diffusion processes, where each competing side is represented by a different state of a stochastic process.
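As a rough illustration (not the paper's model), a discrete-time sketch of two SIS-like contagions competing for susceptible nodes on a random contact graph; all rates and the graph density are made-up values:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
A = (rng.random((n, n)) < 0.05).astype(int)   # random contact graph
A = np.triu(A, 1); A = A + A.T                # symmetric, no self-loops
state = rng.choice([0, 1, 2], size=n, p=[0.8, 0.1, 0.1])  # 0 = S, 1/2 = the two contagions

beta = {1: 0.06, 2: 0.04}   # per-contact infection rates (illustrative)
delta = 0.05                # recovery rate back to susceptible

for _ in range(100):
    infected_nbrs = {s: A @ (state == s) for s in (1, 2)}
    for s in (1, 2):
        # A susceptible node with k infected neighbours is taken with prob 1-(1-beta)^k.
        p_inf = 1 - (1 - beta[s]) ** infected_nbrs[s]
        take = (state == 0) & (rng.random(n) < p_inf)
        state[take] = s
    recover = (state != 0) & (rng.random(n) < delta)
    state[recover] = 0

print(np.bincount(state, minlength=3))  # final counts: susceptible, contagion 1, contagion 2
```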

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

no code implementations · NeurIPS 2020 · Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability to find good minima instead of being trapped in spurious ones remains to a large extent an open problem.

Retrieval

Thresholds of descending algorithms in inference problems

no code implementations · 2 Jan 2020 · Stefano Sarao Mannelli, Lenka Zdeborová

We review recent works on analyzing the dynamics of gradient-based algorithms in a prototypical statistical inference problem.

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

1 code implementation · NeurIPS 2019 · Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.
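Schematically, and up to normalization conventions, the spiked matrix-tensor model observes a planted signal $x^*\in\mathbb{R}^N$ through a noisy rank-one matrix and a noisy order-$p$ tensor:

```latex
Y_{ij} = \frac{x^*_i x^*_j}{\sqrt{N}} + \xi_{ij}, \qquad
T_{i_1 \dots i_p} = \sqrt{\frac{(p-1)!}{N^{p-1}}}\, x^*_{i_1} \cdots x^*_{i_p} + \xi_{i_1 \dots i_p},
```

with independent Gaussian noises of variances $\Delta_2$ and $\Delta_p$; gradient-flow then descends the corresponding least-squares loss over $x$.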

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

no code implementations · 18 Jul 2019 · Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones.

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

no code implementations · 1 Feb 2019 · Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model.

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

no code implementations · 21 Dec 2018 · Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.
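The Langevin algorithm in the title augments each gradient step with Gaussian noise; a generic discretized sketch, where the loss gradient and parameter values below are placeholders rather than the paper's setting:

```python
import numpy as np

rng = np.random.default_rng(2)

def grad_loss(x):
    # Placeholder gradient of some high-dimensional loss L(x).
    return x * (np.sum(x**2) / len(x) - 1.0)

x = rng.standard_normal(100)
eta, temperature = 0.01, 0.1   # illustrative step size and noise level
for _ in range(1000):
    noise = rng.standard_normal(x.shape)
    # Discretized Langevin dynamics: gradient step plus thermal noise.
    x = x - eta * grad_loss(x) + np.sqrt(2 * eta * temperature) * noise
```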
