1 code implementation • 8 Oct 2024 • Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Andrei Purica
In this empirical article, we introduce INNAprop, an optimization algorithm that combines the INNA method with RMSprop adaptive gradient scaling.
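As a loose illustration of the combination (a sketch only: the coefficients follow one standard first-order reformulation of the DIN dynamics behind INNA, and the exact coupling in INNAprop may differ), the update maintains two variables and rescales the gradient RMSprop-style:

```python
import numpy as np

def innaprop_sketch(grad, theta0, alpha=0.5, beta=0.1, lr=1e-3,
                    rho=0.99, eps=1e-8, n_steps=1000):
    """Illustrative INNA-style inertial update with RMSprop scaling.
    alpha and beta are the two INNA damping hyperparameters; the
    precise discretization in the paper may differ."""
    theta, psi = theta0.copy(), theta0.copy()
    v = np.zeros_like(theta0)                 # RMSprop second-moment accumulator
    for _ in range(n_steps):
        g = grad(theta)
        v = rho * v + (1 - rho) * g ** 2
        g_scaled = g / (np.sqrt(v) + eps)     # adaptive gradient scaling
        drift = -(alpha - 1 / beta) * theta - (1 / beta) * psi
        theta, psi = theta + lr * (drift - beta * g_scaled), psi + lr * drift
    return theta
```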
no code implementations • 24 May 2024 • Franck Iutzeler, Edouard Pauwels, Samuel Vaiter
We investigate the behavior of the derivatives of the iterates of Stochastic Gradient Descent (SGD) with respect to a parameter of the objective and show that they are driven by an inexact SGD recursion on a different objective function, perturbed by the convergence of the original SGD.
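A minimal way to see such iterate derivatives concretely (our toy setting, not the paper's: a ridge-regularized least-squares loss, cyclic mini-batches, and autodiff through the unrolled SGD loop):

```python
import jax
import jax.numpy as jnp

X = jnp.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
y = jnp.array([1.0, 2.0, 3.0])

def loss(w, lam, idx):
    r = X[idx] @ w - y[idx]                   # single-example residual
    return 0.5 * r ** 2 + 0.5 * lam * jnp.sum(w ** 2)

def sgd_final_iterate(lam, lr=0.01, n_steps=200):
    w = jnp.zeros(2)
    for k in range(n_steps):
        w = w - lr * jax.grad(loss)(w, lam, k % 3)  # cyclic "sampling"
    return w

# Derivative of the final SGD iterate with respect to the parameter lam:
print(jax.jacobian(sgd_final_iterate)(0.1))
```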
no code implementations • 30 Apr 2024 • Jérôme Bolte, Tam Le, Éric Moulines, Edouard Pauwels
Motivated by the widespread use of approximate derivatives in machine learning and optimization, we study inexact subgradient methods with non-vanishing additive errors and step sizes.
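For instance (a toy sketch, not from the paper), a subgradient step on f(x) = |x| with a bounded, non-vanishing oracle error and a constant step size drives the iterate into a neighborhood of the minimizer whose size depends on the error level:

```python
import numpy as np

rng = np.random.default_rng(0)
delta, step, x = 0.1, 0.01, 5.0        # error level, constant step, start
for _ in range(5000):
    g = np.sign(x) + delta * rng.uniform(-1, 1)   # inexact subgradient of |x|
    x -= step * g
print(x)   # hovers near 0, within an error-dependent neighborhood
```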
no code implementations • 15 Dec 2022 • Jérôme Bolte, Edouard Pauwels, Antonio José Silveti-Falls
We leverage path differentiability and a recent result on nonsmooth implicit differentiation calculus to give sufficient conditions ensuring that the solution to a monotone inclusion problem will be path differentiable, with formulas for computing its generalized gradient.
no code implementations • 26 Jul 2022 • Edouard Pauwels, Samuel Vaiter
We show that the derivatives of the Sinkhorn-Knopp algorithm, or iterative proportional fitting procedure, converge towards the derivatives of the entropic regularization of the optimal transport problem with a locally uniform linear convergence rate.
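The derivatives in question are what autodiff returns when run through the unrolled iterations; a small sketch (our toy cost and marginals):

```python
import jax
import jax.numpy as jnp

def sinkhorn_scaling(C, a, b, eps=0.1, n_iter=200):
    """Sinkhorn-Knopp / iterative proportional fitting; returns one
    of the two scaling vectors of the entropic OT problem."""
    K = jnp.exp(-C / eps)
    u = jnp.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    return u

C = jnp.array([[0.0, 1.0], [1.0, 0.0]])
a = b = jnp.array([0.5, 0.5])
# Differentiate the iterations with respect to the cost matrix; by the
# stated result this converges to the derivative of the entropic OT solution.
print(jax.jacobian(sinkhorn_scaling)(C, a, b).shape)   # (2, 2, 2)
```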
no code implementations • 1 Jun 2022 • Jérôme Bolte, Ryan Boustany, Edouard Pauwels, Béatrice Pesquet-Popescu
Using the notion of conservative gradient, we provide a simple model to estimate the computational costs of the backward and forward modes of algorithmic differentiation for a wide class of nonsmooth programs.
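The cost asymmetry at stake can be seen on a toy nonsmooth (ReLU-based) program, where reverse mode yields the whole gradient in one pass while forward mode needs one pass per input coordinate (our illustration, not the paper's cost model):

```python
import jax
import jax.numpy as jnp

def f(x):
    return jnp.sum(jnp.maximum(x, 0.0) ** 2)   # nonsmooth at 0 (ReLU)

x = jnp.arange(4.0)
_, vjp = jax.vjp(f, x)
grad_reverse = vjp(1.0)[0]                     # one backward (reverse) pass
grad_forward = jnp.stack(                      # n forward passes
    [jax.jvp(f, (x,), (e,))[1] for e in jnp.eye(4)])
print(jnp.allclose(grad_reverse, grad_forward))   # True
```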
no code implementations • 31 May 2022 • Jérôme Bolte, Edouard Pauwels, Samuel Vaiter
Is there a limiting object for nonsmooth piggyback automatic differentiation (AD)?
no code implementations • 11 Jan 2022 • Swann Marx, Edouard Pauwels
We consider flows of ordinary differential equations (ODEs) driven by path differentiable vector fields.
1 code implementation • NeurIPS 2021 • David Bertoin, Jérôme Bolte, Sébastien Gerchinovitz, Edouard Pauwels
In theory, the choice of ReLU(0) in [0, 1] for a neural network has a negligible influence both on backpropagation and training.
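The experiment is easy to reproduce in any framework that allows custom gradients; here is a sketch (our code) of a ReLU whose backpropagated derivative at exactly 0 is a chosen s in [0, 1]:

```python
import jax
import jax.numpy as jnp

def relu_with_s(s):
    """ReLU whose derivative at 0 is set to s in [0, 1]."""
    @jax.custom_vjp
    def relu(x):
        return jnp.maximum(x, 0.0)
    def fwd(x):
        return relu(x), x
    def bwd(x, g):
        d = jnp.where(x > 0, 1.0, jnp.where(x < 0, 0.0, s))
        return (g * d,)
    relu.defvjp(fwd, bwd)
    return relu

x = jnp.array(0.0)
print(jax.grad(relu_with_s(0.0))(x), jax.grad(relu_with_s(1.0))(x))  # 0.0 1.0
```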
no code implementations • NeurIPS 2021 • Jérôme Bolte, Tam Le, Edouard Pauwels, Antonio Silveti-Falls
In view of training increasingly complex learning architectures, we establish a nonsmooth implicit function theorem with an operational calculus.
1 code implementation • 5 Mar 2021 • Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels
Aiming at a direct and simple improvement of vanilla SGD, this paper presents a fine-tuning of its step sizes in the mini-batch case.
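As a stand-in for the paper's rule (the exact tuning differs; this only conveys the flavor of curvature-aware step-size adaptation from successive gradients, via a Barzilai-Borwein quotient):

```python
import numpy as np

def step_tuned_sgd(grad, x0, lr=0.1, n_steps=100):
    x = x0.copy()
    g_prev, x_prev = grad(x), x.copy()
    x = x - lr * g_prev
    for _ in range(n_steps - 1):
        g = grad(x)
        s, y = x - x_prev, g - g_prev        # displacement and gradient change
        if y @ s > 0:                        # positive-curvature test
            lr = (s @ s) / (y @ s)           # Barzilai-Borwein step size
        x_prev, g_prev = x.copy(), g
        x = x - lr * g
    return x
```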
1 code implementation • 13 Jan 2021 • Tong Chen, Jean-Bernard Lasserre, Victor Magron, Edouard Pauwels
We introduce a sublevel Moment-SOS hierarchy where each SDP relaxation can be viewed as an intermediate (or interpolation) between the d-th and (d+1)-th order SDP relaxations of the Moment-SOS hierarchy (dense or sparse version).
Combinatorial Optimization • Optimization and Control
no code implementations • 24 Nov 2020 • Cheik Traoré, Edouard Pauwels
We prove that the iterates produced by either the scalar step-size variant or the coordinatewise variant of the AdaGrad algorithm are convergent sequences when applied to convex objective functions with Lipschitz gradient.
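Both variants in one sketch (standard AdaGrad, for reference; the convergence statement concerns the iterates themselves, not just function values):

```python
import numpy as np

def adagrad(grad, x0, lr=1.0, eps=1e-8, n_steps=1000, coordinatewise=True):
    x = x0.copy()
    acc = np.zeros_like(x0) if coordinatewise else 0.0
    for _ in range(n_steps):
        g = grad(x)
        acc = acc + (g ** 2 if coordinatewise else g @ g)  # accumulated squares
        x = x - lr * g / (np.sqrt(acc) + eps)
    return x
```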
no code implementations • 17 Jul 2020 • Jérôme Bolte, Lilian Glaudin, Edouard Pauwels, Mathieu Serrurier
We present a new algorithm to solve min-max or min-min problems beyond the convex setting.
no code implementations • 15 Jul 2020 • Edouard Pauwels
Minibatch decomposition methods for empirical risk minimization are commonly analysed in a stochastic approximation setting, also known as sampling with replacement.
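The distinction, in code (with replacement corresponds to the stochastic approximation setting; without replacement is the random-reshuffling scheme practitioners actually use):

```python
import numpy as np

rng = np.random.default_rng(0)
n, batch = 100, 10

def batches_with_replacement(n_steps):
    for _ in range(n_steps):
        yield rng.integers(0, n, size=batch)      # i.i.d. index draws

def batches_without_replacement(n_epochs):
    for _ in range(n_epochs):
        perm = rng.permutation(n)                 # reshuffle once per epoch
        for i in range(0, n, batch):
            yield perm[i:i + batch]
```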
no code implementations • NeurIPS 2020 • Jérôme Bolte, Edouard Pauwels
Automatic differentiation, as implemented today, does not have a simple mathematical model adapted to the needs of modern machine learning.
2 code implementations • NeurIPS 2020 • Tong Chen, Jean-Bernard Lasserre, Victor Magron, Edouard Pauwels
The Lipschitz constant of a network plays an important role in many applications of deep learning, such as robustness certification and Wasserstein Generative Adversarial Networks.
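For context, the baseline that tighter estimation methods improve on is the naive product of layer-wise spectral norms (a valid upper bound since ReLU is 1-Lipschitz); the paper's polynomial-optimization approach computes much tighter certified bounds than this:

```python
import numpy as np

rng = np.random.default_rng(0)
weights = [rng.standard_normal((8, 4)), rng.standard_normal((1, 8))]
naive_bound = np.prod([np.linalg.norm(W, 2) for W in weights])  # spectral norms
print(naive_bound)
```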
no code implementations • 31 Oct 2019 • Mai Trang Vu, François Bachoc, Edouard Pauwels
We consider the problem of estimating the support of a measure from a finite, independent, sample.
no code implementations • 25 Sep 2019 • Mathieu Serrurier, Jean-Michel Loubes, Edouard Pauwels
For both models, we devise a learning algorithm based on approximation of Wasserstein distances using adversarial networks.
no code implementations • 23 Sep 2019 • Jérôme Bolte, Edouard Pauwels
Modern problems in AI or in numerical analysis require nonsmooth approaches with a flexible calculus.
2 code implementations • 29 May 2019 • Camille Castera, Jérôme Bolte, Cédric Févotte, Edouard Pauwels
We prove the convergence of INNA for most deep learning problems.
no code implementations • 19 Oct 2018 • Edouard Pauwels, Mihai Putinar, Jean-Bernard Lasserre
Spectral features of the empirical moment matrix constitute a resourceful tool for unveiling properties of a cloud of points, among which density, support, and latent structures.
no code implementations • NeurIPS 2018 • Edouard Pauwels, Francis Bach, Jean-Philippe Vert
Statistical leverage scores emerged as a fundamental tool for matrix sketching and column sampling with applications to low rank approximation, regression, random feature learning and quadrature.
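Leverage scores are simple to state even if approximating them at scale is the hard part: the i-th score is the squared Euclidean norm of the i-th row of the matrix of left singular vectors, and the scores sum to the rank:

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((100, 5))
U, _, _ = np.linalg.svd(A, full_matrices=False)
leverage = (U ** 2).sum(axis=1)     # leverage score of each row of A
print(leverage.sum())               # equals rank(A) = 5
```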
no code implementations • 11 Jan 2017 • Jean-Bernard Lasserre, Edouard Pauwels
We provide a consistency result relating the empirical Christoffel function to its population counterpart in the limit of large samples.
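In one variable the empirical Christoffel function is a few lines (a sketch of the standard construction): build the empirical moment matrix from the monomial vector v(x) = (1, x, ..., x^d), then evaluate Lambda(x) = 1 / (v(x)^T M^{-1} v(x)); small values flag points outside the support of the sampling measure:

```python
import numpy as np

rng = np.random.default_rng(0)
sample = rng.uniform(-1, 1, size=500)
d = 4
V = np.vander(sample, d + 1, increasing=True)   # rows are v(x_i)
M = V.T @ V / len(sample)                        # empirical moment matrix

def christoffel(x):
    v = np.vander(np.atleast_1d(x), d + 1, increasing=True)[0]
    return 1.0 / (v @ np.linalg.solve(M, v))

print(christoffel(0.0), christoffel(2.0))   # large inside [-1, 1], tiny outside
```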
no code implementations • NeurIPS 2016 • Jean-Bernard Lasserre, Edouard Pauwels
This SOS polynomial, built from the inverse of the empirical moment matrix, is directly related to orthogonal polynomials and the Christoffel function.