1 code implementation • 30 Nov 2023 • Lénaïc Chizat, Praneeth Netrapalli
Deep learning succeeds through hierarchical feature learning, yet tuning Hyper-Parameters (HP) such as initialization scales and learning rates gives only indirect control over this behavior.
no code implementations • 25 Jul 2023 • Lénaïc Chizat, Tomas Vaškevičius
We study the computation of doubly regularized Wasserstein barycenters, a recently introduced family of entropic barycenters governed by inner and outer regularization strengths.
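To make the inner regularization concrete, here is a minimal sketch (not the paper's doubly regularized algorithm) of the classical entropic barycenter on a shared 1-D grid, computed with iterative Bregman projections; the grid, cost, and regularization strength are illustrative.

```python
# Illustrative sketch (not the paper's doubly regularized algorithm): the
# classical inner-regularized entropic barycenter on a shared 1-D grid,
# computed by iterative Bregman projections (Benamou et al., 2015).
import numpy as np

def entropic_barycenter(ps, weights, grid, eps=1e-2, iters=500):
    """ps: array of shape (num_measures, n) of probability vectors on `grid`."""
    C = (grid[:, None] - grid[None, :]) ** 2        # squared-distance cost
    G = np.exp(-C / eps)                            # Gibbs kernel
    v = np.ones_like(ps)                            # Sinkhorn scalings
    for _ in range(iters):
        u = ps / (G @ v.T).T                        # match each input marginal
        logs = np.log(G.T @ u.T).T
        b = np.exp(weights @ logs)                  # geometric mean = barycenter
        v = b[None, :] / (G.T @ u.T).T              # match the barycenter marginal
    return b / b.sum()

grid = np.linspace(0.0, 1.0, 200)
p1 = np.exp(-((grid - 0.25) / 0.05) ** 2); p1 /= p1.sum()
p2 = np.exp(-((grid - 0.75) / 0.05) ** 2); p2 /= p2.sum()
bary = entropic_barycenter(np.stack([p1, p2]), np.array([0.5, 0.5]), grid)
```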
no code implementations • NeurIPS 2023 • Guillaume Wang, Lénaïc Chizat
We show that these dynamics also converge as soon as $S$ is nonzero (partial curvature) and the eigenvectors of the antisymmetric part $A$ are in general position with respect to the kernel of $S$.
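As a toy illustration of this convergence statement, one can simulate the linear dynamics $\dot x = -(S+A)x$ with a PSD $S$ that has a nontrivial kernel; the matrices below are hypothetical, chosen so that the antisymmetric part mixes the kernel of $S$ into the curvature direction.

```python
# Toy illustration (matrices are hypothetical, not from the paper):
# simulate x' = -(S + A) x where S is PSD with a nontrivial kernel
# (partial curvature) and A is antisymmetric.
import numpy as np

S = np.diag([1.0, 0.0])                  # PSD, ker(S) = span(e2)
A = np.array([[0.0, 1.0], [-1.0, 0.0]])  # antisymmetric rotation
M = S + A

x = np.array([1.0, 1.0])
dt = 1e-2
for _ in range(5000):
    x = x - dt * (M @ x)                 # explicit Euler on x' = -M x
print(np.linalg.norm(x))  # -> ~0: the rotation mixes e2 into the
                          # curvature direction, so x still converges
```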
no code implementations • 31 Mar 2023 • Sebastian Neumayer, Lénaïc Chizat, Michael Unser
In supervised learning, the regularization path is sometimes used as a convenient theoretical proxy for the optimization path of gradient descent initialized from zero.
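A small least-squares experiment (with hypothetical data and the folklore matching $\lambda \approx 1/(\eta t)$) illustrates why the proxy is plausible: gradient descent from zero after $t$ steps stays close to the ridge solution at the matched regularization strength.

```python
# Illustrative check of the proxy on least squares (data is hypothetical):
# compare GD from zero after t steps with ridge at the matched lam ~ 1/(eta*t).
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 20))
y = rng.standard_normal(50)

eta = 1e-3
beta_gd = np.zeros(20)
for t in range(1, 2001):
    beta_gd -= eta * X.T @ (X @ beta_gd - y)     # gradient descent from zero
    if t in (10, 100, 1000):
        lam = 1.0 / (eta * t)                    # folklore matching of strengths
        beta_ridge = np.linalg.solve(X.T @ X + lam * np.eye(20), X.T @ y)
        print(t, np.linalg.norm(beta_gd - beta_ridge) / np.linalg.norm(beta_ridge))
```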
1 code implementation • 21 Mar 2023 • Lénaïc Chizat
In particular, it can be estimated efficiently: given $n$ samples from each of the probability measures, it converges in relative entropy to the population barycenter at a rate $n^{-1/2}$.
1 code implementation • 29 Nov 2022 • Lénaïc Chizat, Maria Colombo, Xavier Fernández-Real, Alessio Figalli
We finally study the continuous-time limit obtained for infinitely wide linear neural networks and show that the linear predictors of the neural network converge at an exponential rate to the minimal $\ell_2$-norm minimizer of the risk.
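The underlying implicit-bias phenomenon can be previewed on plain linear regression (a sketch with hypothetical data, not the paper's deep linear setting): gradient descent initialized at zero stays in the row span of the data and therefore converges to the minimal $\ell_2$-norm interpolator.

```python
# Sketch of the limiting behavior on plain linear regression (data is
# hypothetical): GD from zero stays in the row span of X, so it converges
# to the minimal l2-norm solution of the underdetermined system X w = y.
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((20, 100))     # more parameters than samples
y = rng.standard_normal(20)

w = np.zeros(100)
eta = 1e-3
for _ in range(50_000):
    w -= eta * X.T @ (X @ w - y)       # full-batch gradient descent

w_min_norm = np.linalg.pinv(X) @ y     # minimal-norm interpolator
print(np.linalg.norm(w - w_min_norm))  # -> ~0
```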
1 code implementation • 2 Nov 2022 • Guillaume Wang, Lénaïc Chizat
We consider the problem of computing mixed Nash equilibria of two-player zero-sum games with continuous sets of pure strategies and with first-order access to the payoff function.
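As a finite-dimensional warm-up (not the paper's continuous setting), simultaneous multiplicative-weights updates with iterate averaging approximate a mixed Nash equilibrium of a zero-sum matrix game using only first-order feedback; the payoff matrix below is hypothetical.

```python
# Finite warm-up (not the paper's continuous setting): simultaneous
# multiplicative-weights / mirror-descent updates on a zero-sum matrix game;
# the averaged iterates approach a mixed Nash equilibrium.
import numpy as np

rng = np.random.default_rng(2)
P = rng.standard_normal((5, 7))        # payoff: row player maximizes x^T P y

x = np.ones(5) / 5
y = np.ones(7) / 7
x_avg, y_avg = np.zeros(5), np.zeros(7)
eta = 0.1
T = 20_000
for _ in range(T):
    gx, gy = P @ y, P.T @ x                # first-order feedback for each player
    x *= np.exp(eta * gx); x /= x.sum()    # row player ascends
    y *= np.exp(-eta * gy); y /= y.sum()   # column player descends
    x_avg += x / T; y_avg += y / T

# duality gap of the averaged pair: -> 0 at a mixed Nash equilibrium
gap = (P @ y_avg).max() - (P.T @ x_avg).min()
print(gap)
```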
1 code implementation • 14 May 2022 • Lénaïc Chizat, Stephen Zhang, Matthieu Heitz, Geoffrey Schiebinger
Trajectory inference aims at recovering the dynamics of a population from snapshots of its temporal marginals.
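A common building block for this task is an entropic optimal-transport coupling between consecutive snapshots, sketched below with Sinkhorn iterations on hypothetical point clouds; the paper's estimator is a global one over all time points, so this is only the pairwise ingredient.

```python
# Building-block sketch (the paper's estimator is global; this only couples
# two consecutive snapshots): entropic OT via Sinkhorn iterations.
import numpy as np

def sinkhorn_coupling(xs, xt, eps=0.05, iters=200):
    """Entropic OT plan between uniform point clouds xs (n, d) and xt (m, d)."""
    C = ((xs[:, None, :] - xt[None, :, :]) ** 2).sum(-1)  # squared distances
    G = np.exp(-C / eps)
    a = np.full(len(xs), 1 / len(xs))
    b = np.full(len(xt), 1 / len(xt))
    u, v = np.ones(len(xs)), np.ones(len(xt))
    for _ in range(iters):
        u = a / (G @ v)
        v = b / (G.T @ u)
    return u[:, None] * G * v[None, :]   # coupling with marginals a, b

t0 = np.random.default_rng(3).normal(0.0, 0.1, (100, 2))            # snapshot at t
t1 = t0 + 0.3 + np.random.default_rng(4).normal(0.0, 0.05, (100, 2))  # at t+1
plan = sinkhorn_coupling(t0, t1)
print(plan.sum(axis=1)[:3])  # each row sums to ~1/100 (source marginal)
```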
1 code implementation • 29 Oct 2021 • Karl Hajjar, Lénaïc Chizat, Christophe Giraud
For two-layer neural networks, it has been understood via these asymptotics that the nature of the trained model radically changes depending on the scale of the initial random weights, ranging from a kernel regime (for large initial variance) to a feature learning regime (for small initial variance).
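A rough numerical probe of this scale dependence (toy data and architecture, not the paper's asymptotic analysis): train a two-layer ReLU network with output scale $\alpha$ and a matched step size, and measure how far the hidden weights move from initialization; large $\alpha$ should show little movement (lazy/kernel regime), small $\alpha$ a lot (feature learning).

```python
# Rough illustration (toy setup, not the paper's asymptotics): train a
# two-layer ReLU net with output scale alpha and a matched step size,
# then measure how far the hidden weights moved from initialization.
import numpy as np

def relative_weight_motion(alpha, steps=5000, seed=0):
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((64, 5))
    y = np.sin(X[:, 0])                       # simple toy target
    W0 = rng.standard_normal((50, 5)) / np.sqrt(5)
    a = rng.choice([-1.0, 1.0], 50) / 50
    W = W0.copy()
    lr = 0.1 / alpha**2                       # matched step size across scales
    for _ in range(steps):
        H = np.maximum(W @ X.T, 0.0)          # hidden activations (50, 64)
        f = alpha * (a @ H)                   # scaled network output
        r = f - y                             # residuals
        gW = alpha * (a[:, None] * (H > 0)) @ (r[:, None] * X) / len(y)
        W -= lr * gW
    return np.linalg.norm(W - W0) / np.linalg.norm(W0)

for alpha in (0.1, 1.0, 10.0):
    print(alpha, relative_weight_motion(alpha))
```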
1 code implementation • NeurIPS 2020 • Kimia Nadjahi, Alain Durmus, Lénaïc Chizat, Soheil Kolouri, Shahin Shahrampour, Umut Şimşekli
The idea of slicing divergences has proven successful for comparing two probability measures in various machine learning applications, including generative modeling; it consists in computing the expected value of a "base divergence" between one-dimensional random projections of the two measures.
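The construction is short to state in code; below is a minimal Monte Carlo sketch with the 2-Wasserstein distance as the base divergence (sample sizes and the number of projections are illustrative).

```python
# Minimal sketch of a sliced divergence with the 2-Wasserstein distance as
# the base divergence (all sizes illustrative): average the 1-D distance
# between random projections of the two samples.
import numpy as np

def sliced_wasserstein2(xs, ys, n_projections=200, seed=0):
    """Monte Carlo sliced W2 between equal-size samples xs, ys of shape (n, d)."""
    rng = np.random.default_rng(seed)
    d = xs.shape[1]
    total = 0.0
    for _ in range(n_projections):
        theta = rng.standard_normal(d)
        theta /= np.linalg.norm(theta)         # uniform direction on the sphere
        px, py = np.sort(xs @ theta), np.sort(ys @ theta)
        total += np.mean((px - py) ** 2)       # 1-D W2^2 via sorted quantiles
    return np.sqrt(total / n_projections)

rng = np.random.default_rng(1)
xs = rng.normal(0.0, 1.0, (500, 3))
ys = rng.normal(0.5, 1.0, (500, 3))
print(sliced_wasserstein2(xs, ys))
```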
no code implementations • 16 Mar 2019 • Jean-Luc Peyrot, Laurent Duval, Frédéric Payan, Lauriane Bouard, Lénaïc Chizat, Sébastien Schneider, Marc Antonini
Efficient representations and storage are thus becoming "enabling technologies" in HPC experimental and simulation science.