Search Results for author: Stefano Sarao Mannelli

Found 18 papers, 2 papers with code

Why Do Animals Need Shaping? A Theory of Task Composition and Curriculum Learning

no code implementations • 28 Feb 2024 • Jin Hwa Lee, Stefano Sarao Mannelli, Andrew Saxe

Diverse studies in systems neuroscience begin with extended periods of training known as 'shaping' procedures.

Paper
Add Code

The RL Perceptron: Generalisation Dynamics of Policy Learning in High Dimensions

no code implementations • 17 Jun 2023 • Nishil Patel, Sebastian Lee, Stefano Sarao Mannelli, Sebastian Goldt, Andrew Saxe

Reinforcement learning (RL) algorithms have proven transformative in a range of domains.

Atari Games Reinforcement Learning (RL)

Paper
Add Code

Optimal transfer protocol by incremental layer defrosting

no code implementations • 2 Mar 2023 • Federica Gerace, Diego Doimo, Stefano Sarao Mannelli, Luca Saglietti, Alessandro Laio

The simplest transfer learning protocol is based on ``freezing" the feature-extractor layers of a network pre-trained on a data-rich source task, and then adapting only the last layers to a data-poor target task.

Transfer Learning

Paper
Add Code

Bias-inducing geometries: an exactly solvable data model with fairness implications

no code implementations • 31 May 2022 • Stefano Sarao Mannelli, Federica Gerace, Negar Rostamzadeh, Luca Saglietti

Then, we consider a novel mitigation strategy based on a matched inference approach, consisting in the introduction of coupled learning models.

Fairness

Paper
Add Code

Maslow's Hammer for Catastrophic Forgetting: Node Re-Use vs Node Activation

1 code implementation • 18 May 2022 • Sebastian Lee, Stefano Sarao Mannelli, Claudia Clopath, Sebastian Goldt, Andrew Saxe

Continual learning - learning new tasks in sequence while maintaining performance on old tasks - remains particularly challenging for artificial neural networks.

Continual Learning

Paper
Code

An Analytical Theory of Curriculum Learning in Teacher-Student Networks

no code implementations • 15 Jun 2021 • Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe

To study the former, we provide an exact description of the online learning setting, confirming the long-standing experimental observation that curricula can modestly speed up learning.

Paper
Add Code

Probing transfer learning with a model of synthetic correlated datasets

no code implementations • 9 Jun 2021 • Federica Gerace, Luca Saglietti, Stefano Sarao Mannelli, Andrew Saxe, Lenka Zdeborová

Transfer learning can significantly improve the sample efficiency of neural networks, by exploiting the relatedness between a data-scarce target task and a data-abundant source task.

Binary Classification Transfer Learning

Paper
Add Code

Analytical Study of Momentum-Based Acceleration Methods in Paradigmatic High-Dimensional Non-Convex Problems

no code implementations • NeurIPS 2021 • Stefano Sarao Mannelli, Pierfrancesco Urbani

The optimization step in many machine learning problems rarely relies on vanilla gradient descent but it is common practice to use momentum-based accelerated methods.

Numerical Integration

Paper
Add Code

Epidemic mitigation by statistical inference from contact tracing data

no code implementations • 20 Sep 2020 • Antoine Baker, Indaco Biazzo, Alfredo Braunstein, Giovanni Catania, Luca Dall'Asta, Alessandro Ingrosso, Florent Krzakala, Fabio Mazza, Marc Mézard, Anna Paola Muntoni, Maria Refinetti, Stefano Sarao Mannelli, Lenka Zdeborová

We conclude that probabilistic risk estimation is capable to enhance performance of digital contact tracing and should be considered in the currently developed mobile applications.

Bayesian Inference Privacy Preserving

Paper
Add Code

Optimization and Generalization of Shallow Neural Networks with Quadratic Activation Functions

no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Eric Vanden-Eijnden, Lenka Zdeborová

We consider a teacher-student scenario where the teacher has the same structure as the student with a hidden layer of smaller width $m^*\le m$.

Paper
Add Code

Post-Workshop Report on Science meets Engineering in Deep Learning, NeurIPS 2019, Vancouver

no code implementations • 25 Jun 2020 • Levent Sagun, Caglar Gulcehre, Adriana Romero, Negar Rostamzadeh, Stefano Sarao Mannelli

Science meets Engineering in Deep Learning took place in Vancouver as part of the Workshop section of NeurIPS 2019.

Paper
Add Code

Winning the competition: enhancing counter-contagion in SIS-like epidemic processes

no code implementations • 24 Jun 2020 • Argyris Kalogeratos, Stefano Sarao Mannelli

In this paper we consider the epidemic competition between two generic diffusion processes, where each competing side is represented by a different state of a stochastic process.

Paper
Add Code

Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval

no code implementations • NeurIPS 2020 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Despite the widespread use of gradient-based algorithms for optimizing high-dimensional non-convex functions, understanding their ability of finding good minima instead of being trapped in spurious ones remains to a large extent an open problem.

Retrieval

Paper
Add Code

Thresholds of descending algorithms in inference problems

no code implementations • 2 Jan 2020 • Stefano Sarao Mannelli, Lenka Zdeborova

We review recent works on analyzing the dynamics of gradient-based algorithms in a prototypical statistical inference problem.

Paper
Add Code

Who is Afraid of Big Bad Minima? Analysis of gradient-flow in spiked matrix-tensor models

1 code implementation • NeurIPS 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Gradient-based algorithms are effective for many machine learning tasks, but despite ample recent effort and some progress, it often remains unclear why they work in practice in optimising high-dimensional non-convex functions and why they find good minima instead of being trapped in spurious ones. Here we present a quantitative theory explaining this behaviour in a spiked matrix-tensor model. Our framework is based on the Kac-Rice analysis of stationary points and a closed-form analysis of gradient-flow originating from statistical physics.

Paper
Code

Who is Afraid of Big Bad Minima? Analysis of Gradient-Flow in a Spiked Matrix-Tensor Model

no code implementations • 18 Jul 2019 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Lenka Zdeborová

Paper
Add Code

Passed & Spurious: Descent Algorithms and Local Minima in Spiked Matrix-Tensor Models

no code implementations • 1 Feb 2019 • Stefano Sarao Mannelli, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

In this work we analyse quantitatively the interplay between the loss landscape and performance of descent algorithms in a prototypical inference problem, the spiked matrix-tensor model.

Paper
Add Code

Marvels and Pitfalls of the Langevin Algorithm in Noisy High-dimensional Inference

no code implementations • 21 Dec 2018 • Stefano Sarao Mannelli, Giulio Biroli, Chiara Cammarota, Florent Krzakala, Pierfrancesco Urbani, Lenka Zdeborová

Gradient-descent-based algorithms and their stochastic versions have widespread applications in machine learning and statistical inference.

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.