We also propose and analyze an estimator based on Richardson extrapolation of the Sinkhorn divergence, which enjoys improved statistical and computational efficiency guarantees under a regularity condition on the approximation error; this condition is in particular satisfied for Gaussian densities.
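As an illustration, the idea of Richardson extrapolation applied to entropic optimal transport can be sketched as follows: run the standard Sinkhorn fixed-point iterations at two regularization levels and combine the two costs so that the leading bias term in the regularization parameter cancels. This is only a minimal sketch, not the estimator analyzed in the text; the function names, the iteration count, and the assumption of a first-order bias in `eps` are illustrative choices.

```python
import numpy as np

def sinkhorn_cost(a, b, C, eps, n_iter=500):
    """Entropic OT cost <P, C> between histograms a, b with cost matrix C,
    computed via standard Sinkhorn matrix-scaling iterations."""
    K = np.exp(-C / eps)
    u = np.ones_like(a)
    for _ in range(n_iter):
        v = b / (K.T @ u)
        u = a / (K @ v)
    P = u[:, None] * K * v[None, :]   # transport plan with marginals a, b
    return np.sum(P * C)

def extrapolated_cost(a, b, C, eps):
    """First-order Richardson extrapolation in the regularization eps:
    combine estimates at eps and eps/2 to cancel the leading bias term
    (illustrative; assumes the bias is approximately linear in eps)."""
    return 2.0 * sinkhorn_cost(a, b, C, eps / 2) - sinkhorn_cost(a, b, C, eps)
```

On a toy example where the unregularized cost is zero (identical histograms on the same support), the extrapolated estimate is closer to the true value than the plain entropic cost at the same `eps`.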
Minimizing a convex function of a measure with a sparsity-inducing penalty is a typical problem arising, e.g., in sparse spikes deconvolution or in the training of two-layer neural networks.
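A finite-dimensional analogue of such problems is the l1-penalized least-squares (LASSO) objective, where the penalty drives most coordinates exactly to zero. The following sketch uses proximal gradient descent (ISTA); the function name and parameter choices are illustrative, not taken from the text.

```python
import numpy as np

def ista(A, y, lam, step, n_iter=200):
    """Proximal gradient (ISTA) for  min_x 0.5*||Ax - y||^2 + lam*||x||_1.
    The soft-thresholding step sets small coordinates exactly to zero,
    which is the hallmark of sparsity-inducing penalties."""
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = A.T @ (A @ x - y)              # gradient of the smooth part
        z = x - step * g                   # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)  # prox of l1
    return x
```

In the measure setting, the analogue of the l1 norm is the total-variation norm, and the sparse vector corresponds to a sparse sum of Dirac masses (spikes).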
In a series of recent theoretical works, it was shown that strongly over-parameterized neural networks trained with gradient-based methods converge exponentially fast to zero training loss, while their parameters hardly vary.
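This "lazy training" phenomenon can be observed numerically in a toy setting: a wide two-layer ReLU network in the 1/sqrt(m) scaling, trained by full-batch gradient descent on its hidden layer. The scaling, architecture, and hyperparameters below are illustrative assumptions, not the exact setup of the works cited.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, m = 5, 3, 5000                      # samples, input dim, width (over-parameterized)
X = rng.standard_normal((n, d))
y = rng.standard_normal(n)

# Two-layer network in the "lazy" scaling: f(x) = (1/sqrt(m)) * sum_j a_j relu(w_j . x)
W = rng.standard_normal((m, d))           # trained hidden weights
a = rng.choice([-1.0, 1.0], size=m)       # fixed random output layer

def forward(W):
    return (np.maximum(X @ W.T, 0.0) @ a) / np.sqrt(m)

W0 = W.copy()
loss0 = 0.5 * np.sum((forward(W) - y) ** 2)
lr = 0.1
for _ in range(300):
    r = forward(W) - y                            # residuals
    act = (X @ W.T > 0).astype(float)             # ReLU derivative mask, shape (n, m)
    grad = ((r[:, None] * act) * a[None, :]).T @ X / np.sqrt(m)
    W -= lr * grad

loss = 0.5 * np.sum((forward(W) - y) ** 2)
rel_move = np.linalg.norm(W - W0) / np.linalg.norm(W0)  # relative parameter displacement
```

The loss drops sharply while `rel_move` stays small: the network fits the data with its parameters remaining close to initialization, consistent with the linearized (neural tangent kernel) picture.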
This article introduces a new class of fast algorithms to approximate variational problems involving unbalanced optimal transport.
Optimization and Control 65K10
These distances admit two equivalent formulations: (i) a "fluid dynamic" formulation, which defines the distance as a geodesic distance over the space of measures; and (ii) a static "Kantorovich" formulation, where the distance is the minimum of an optimization program over pairs of couplings describing the transfer (transport, creation, and destruction) of mass between two measures.
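As a point of reference for the static formulation, the classical balanced Kantorovich problem between two discrete measures is a linear program over couplings with fixed marginals (no mass creation or destruction). The sketch below solves it with `scipy.optimize.linprog`; the function name is an illustrative choice, and the unbalanced case would additionally relax the marginal constraints.

```python
import numpy as np
from scipy.optimize import linprog

def kantorovich(a, b, C):
    """Solve the balanced Kantorovich linear program
        min_P <P, C>   s.t.   P 1 = a,   P^T 1 = b,   P >= 0,
    where P is the coupling (transport plan) and C the ground cost matrix."""
    n, m = C.shape
    A_eq = np.zeros((n + m, n * m))
    for i in range(n):
        A_eq[i, i * m:(i + 1) * m] = 1.0      # row sums equal a (source marginal)
    for j in range(m):
        A_eq[n + j, j::m] = 1.0               # column sums equal b (target marginal)
    res = linprog(C.ravel(), A_eq=A_eq, b_eq=np.concatenate([a, b]),
                  bounds=(0, None), method="highs")
    return res.fun, res.x.reshape(n, m)
```

When the two measures coincide, the optimal plan is (a rescaling of) the identity and the cost is zero; the unbalanced formulations of the text replace the hard marginal constraints with penalties that price the creation and destruction of mass.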
This metric interpolates between the quadratic Wasserstein and the Fisher-Rao metrics and generalizes optimal transport to measures with different masses.
Analysis of PDEs