1 code implementation • 16 Jul 2024 • Freya Behrens, Luca Biggio, Lenka Zdeborová
From a broader perspective, our analysis offers a framework to understand how the interaction of different architectural components of transformer models shapes diverse algorithmic solutions and approximations.
no code implementations • 10 Nov 2023 • Elior Benarous, Sotiris Anagnostidis, Luca Biggio, Thomas Hofmann
In this study, we investigate how neural networks exhibit shape bias during training on synthetic datasets, serving as an indicator of the synthetic data quality.
no code implementations • 31 May 2023 • Tommaso Bendinelli, Luca Biggio, Daniel Nyfeler, Abhigyan Ghosh, Peter Tollan, Moritz Alexander Kirschmann, Olga Fink
The value of luxury goods, particularly investment-grade gemstones, is greatly influenced by their origin and authenticity, sometimes resulting in differences worth millions of dollars.
no code implementations • NeurIPS 2023 • Sotiris Anagnostidis, Dario Pavllo, Luca Biggio, Lorenzo Noci, Aurelien Lucchi, Thomas Hofmann
Autoregressive Transformers adopted in Large Language Models (LLMs) are hard to scale to long sequences.
1 code implementation • 7 May 2023 • Venkat Nemani, Luca Biggio, Xun Huan, Zhen Hu, Olga Fink, Anh Tran, Yan Wang, Xiaoge Zhang, Chao Hu
In this tutorial, we aim to provide a holistic lens on emerging UQ methods for ML models with a particular focus on neural networks and the applications of these UQ methods in tackling engineering design as well as prognostics and health management problems.
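One family of UQ methods commonly covered in such tutorials is the deep ensemble, where the spread of several independently trained models' predictions serves as an uncertainty estimate. The sketch below illustrates the idea with tiny random-feature regressors; all names and the toy data are illustrative assumptions, not the tutorial's specific method.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy 1-D regression data: y = sin(3x) + noise on [-1, 1]
x_train = rng.uniform(-1, 1, (40, 1))
y_train = np.sin(3 * x_train[:, 0]) + 0.1 * rng.normal(size=40)

def fit_member(seed):
    """Fit one ensemble member: random tanh features + least squares."""
    r = np.random.default_rng(seed)
    W = r.normal(size=(1, 50))                       # random hidden features
    H = np.tanh(x_train @ W)
    beta = np.linalg.lstsq(H, y_train, rcond=None)[0]
    return lambda x: np.tanh(x @ W) @ beta

ensemble = [fit_member(s) for s in range(5)]

# Predictive mean and spread; the spread is the uncertainty estimate
x_test = np.array([[0.0], [2.0]])                    # in- vs. out-of-range input
preds = np.stack([m(x_test) for m in ensemble])      # (members, points)
mean, std = preds.mean(axis=0), preds.std(axis=0)
```

Members disagree more where training data is scarce, so `std` tends to grow away from the training interval.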
no code implementations • 20 Apr 2023 • Tommaso Bendinelli, Luca Biggio, Pierre-Alexandre Kamienny
In symbolic regression, the goal is to find an analytical expression that accurately fits experimental data with minimal use of mathematical symbols such as operators, variables, and constants.
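The trade-off between fit accuracy and expression length can be sketched as a search over candidate formulas scored by error plus a complexity penalty. The candidate set, symbol counts, and penalty weight below are illustrative assumptions, not the paper's method.

```python
import numpy as np

# Data generated from the ground-truth law y = x**2 + x
x = np.linspace(-2, 2, 50)
y = x**2 + x

# Candidate expressions with a rough count of mathematical symbols
candidates = {
    "x": (lambda x: x, 1),
    "x**2": (lambda x: x**2, 3),
    "x**2 + x": (lambda x: x**2 + x, 5),
    "x**3 + x**2 + x": (lambda x: x**3 + x**2 + x, 9),
}

def score(expr_fn, n_symbols, lam=0.01):
    """Fit error plus a penalty favoring shorter formulas."""
    mse = np.mean((expr_fn(x) - y) ** 2)
    return mse + lam * n_symbols

best = min(candidates, key=lambda k: score(*candidates[k]))
```

Here the penalty rules out the overparameterized cubic even though it also fits the data.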
no code implementations • 19 Jan 2023 • Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto, Frank Norbert Proske, Hans Kersting, Aurelien Lucchi
We study the SAM (Sharpness-Aware Minimization) optimizer, which has recently attracted considerable interest due to its improved performance over classical variants of stochastic gradient descent.
no code implementations • 22 Nov 2022 • Sotiris Anagnostidis, Arne Thomsen, Tomasz Kacprzak, Tilman Tröster, Luca Biggio, Alexandre Refregier, Thomas Hofmann
In this work, we aim to improve upon two-point statistics by employing a PointNet-like neural network to regress the values of the cosmological parameters directly from point cloud data.
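The key property of a PointNet-like model is permutation invariance: a shared per-point transformation followed by a symmetric pooling operation, so the output does not depend on the ordering of the points. A minimal forward-pass sketch (random weights, two output parameters; not the paper's architecture):

```python
import numpy as np

rng = np.random.default_rng(0)

W1 = rng.normal(size=(3, 32))   # shared weights applied to every point
W2 = rng.normal(size=(32, 2))   # head regressing 2 parameters

def regress(points):
    """Per-point ReLU features, max-pool over points, linear head."""
    h = np.maximum(points @ W1, 0.0)  # (n_points, 32), shared across points
    pooled = h.max(axis=0)            # symmetric pooling -> order invariance
    return pooled @ W2

cloud = rng.normal(size=(100, 3))
shuffled = cloud[rng.permutation(100)]
# identical outputs regardless of point ordering
assert np.allclose(regress(cloud), regress(shuffled))
```

The max-pool is what makes the prediction a function of the point set rather than of any particular point sequence.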
no code implementations • 7 Jun 2022 • Lorenzo Noci, Sotiris Anagnostidis, Luca Biggio, Antonio Orvieto, Sidak Pal Singh, Aurelien Lucchi
First, we show that rank collapse of the tokens' representations hinders training by causing the gradients of the queries and keys to vanish at initialization.
1 code implementation • 1 Jun 2022 • Luca Biggio, Tommaso Bendinelli, Chetan Kulkarni, Olga Fink
Electrochemical batteries are ubiquitous devices in our society.
2 code implementations • 26 Jan 2022 • Dimitri von Rütte, Luca Biggio, Yannic Kilcher, Thomas Hofmann
Generating music with deep neural networks has been an area of active research in recent years.
no code implementations • 2 Jan 2022 • Enea Monzio Compagnoni, Anna Scampicchio, Luca Biggio, Antonio Orvieto, Thomas Hofmann, Josef Teichmann
Many finance, physics, and engineering phenomena are modeled by continuous-time dynamical systems driven by highly irregular (stochastic) inputs.
no code implementations • NeurIPS Workshop DLDE 2021 • Enea Monzio Compagnoni, Luca Biggio, Antonio Orvieto
Time series analysis is a widespread task in the natural sciences, social sciences, and engineering.
2 code implementations • 11 Jun 2021 • Luca Biggio, Tommaso Bendinelli, Alexander Neitz, Aurelien Lucchi, Giambattista Parascandolo
We procedurally generate an unbounded set of equations, and simultaneously pre-train a Transformer to predict the symbolic equation from a corresponding set of input-output pairs.
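The data-generation side of such a pipeline can be sketched as sampling a random expression tree and evaluating it at random inputs, yielding (input-output pairs, target expression) training examples. The operator set, depth, and string-based evaluator below are illustrative assumptions, not the paper's implementation.

```python
import random
import numpy as np

UNARY = ["sin", "exp"]
BINARY = ["+", "*"]

def sample_expression(depth=2):
    """Recursively build a random expression string over the variable x."""
    if depth == 0 or random.random() < 0.3:
        return "x"
    if random.random() < 0.5:
        return f"{random.choice(UNARY)}({sample_expression(depth - 1)})"
    op = random.choice(BINARY)
    return f"({sample_expression(depth - 1)} {op} {sample_expression(depth - 1)})"

def make_training_example(n_points=16):
    """One pre-training example: a set of input-output pairs and the
    symbolic equation the model should learn to predict from them."""
    expr = sample_expression()
    x = np.random.uniform(-1, 1, n_points)
    # Toy evaluator via eval; fine here, unsafe for untrusted input
    y = eval(expr, {"sin": np.sin, "exp": np.exp, "x": x})
    return (x, y), expr

random.seed(0)
(xs, ys), target = make_training_example()
```

At scale, the Transformer consumes the numeric pairs and is trained to emit the expression's token sequence, amortizing the symbolic-regression search into a single forward pass.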
no code implementations • 8 Apr 2021 • Luca Biggio, Alexander Wieland, Manuel Arias Chao, Iason Kastanis, Olga Fink
Remaining Useful Life (RUL) estimation is the problem of inferring how long a certain industrial asset can be expected to operate within its defined specifications.
1 code implementation • NeurIPS Workshop LMCA 2020 • Luca Biggio, Tommaso Bendinelli, Aurelien Lucchi, Giambattista Parascandolo
Deep neural networks have proved to be powerful function approximators.