no code implementations • 28 May 2025 • Razvan-Andrei Lascu, Mateusz B. Majka
We study the problem of minimizing non-convex functionals on the space of probability measures, regularized by the relative entropy (KL divergence) with respect to a fixed reference measure, as well as the corresponding problem of solving entropy-regularized non-convex-non-concave min-max problems.
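As a point of orientation (the notation below is illustrative and not necessarily the paper's own), an entropy-regularized objective of this kind typically takes the form, with regularization strength $\sigma > 0$ and fixed reference measure $\pi$,

$$\min_{\mu \in \mathcal{P}(\mathbb{R}^d)} \; F(\mu) + \sigma \operatorname{KL}(\mu \,|\, \pi),$$

with the min-max analogue $\min_{\mu}\max_{\nu}\, F(\mu,\nu) + \sigma \operatorname{KL}(\mu|\pi) - \sigma \operatorname{KL}(\nu|\pi)$, where $F$ is non-convex (respectively non-convex-non-concave) in its measure argument(s).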
no code implementations • 22 Nov 2024 • Razvan-Andrei Lascu, Mateusz B. Majka, David Šiška, Łukasz Szpruch
Since the relative entropy is not Wasserstein differentiable, we prove that the iterates of the scheme belong to a certain Sobolev class; hence the relative entropy $\operatorname{KL}(\cdot|\pi)$ admits a unique Wasserstein sub-gradient along the scheme, and the relative Fisher information is finite.
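For reference, the standard definitions (not specific to this paper): when $\mu \ll \pi$,

$$\operatorname{KL}(\mu|\pi) = \int \log\frac{d\mu}{d\pi}\, d\mu, \qquad \operatorname{I}(\mu|\pi) = \int \Big|\nabla \log\frac{d\mu}{d\pi}\Big|^2 d\mu,$$

so finiteness of the relative Fisher information requires $\log\frac{d\mu}{d\pi}$ to have a weak gradient that is square-integrable with respect to $\mu$, which is itself a Sobolev-type regularity statement about the density $d\mu/d\pi$.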
no code implementations • 24 May 2024 • Razvan-Andrei Lascu, Mateusz B. Majka, Łukasz Szpruch
Gradient flows play a substantial role in addressing many machine learning problems.
no code implementations • 12 Feb 2024 • Razvan-Andrei Lascu, Mateusz B. Majka, Łukasz Szpruch
We study two variants of the mirror descent-ascent algorithm for solving min-max problems on the space of measures: simultaneous and sequential.
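A minimal finite-dimensional sketch of the simultaneous versus sequential distinction, using discrete measures on the probability simplex, a bilinear payoff $x^\top A y$, and the entropic mirror map; the matrix `A`, step size `eta`, and iteration count are illustrative assumptions, not the paper's setting.

```python
import numpy as np

def normalize(w):
    """Normalize positive weights to a probability vector (entropic mirror step)."""
    return w / w.sum()

def mirror_descent_ascent(A, eta=0.1, n_iters=500, sequential=False):
    """Entropic mirror descent-ascent for the bilinear game min_x max_y x^T A y
    on probability simplices (a finite-dimensional stand-in for measures).

    simultaneous: both players update using the current iterate (x_k, y_k);
    sequential:   the ascent step responds to the already-updated x_{k+1}.
    """
    n, m = A.shape
    x = np.full(n, 1.0 / n)  # uniform initial "measures"
    y = np.full(m, 1.0 / m)
    for _ in range(n_iters):
        x_new = normalize(x * np.exp(-eta * (A @ y)))
        if sequential:
            y_new = normalize(y * np.exp(eta * (A.T @ x_new)))  # sees new x
        else:
            y_new = normalize(y * np.exp(eta * (A.T @ x)))      # sees old x
        x, y = x_new, y_new
    return x, y

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    A = rng.standard_normal((5, 5))
    for seq in (False, True):
        x, y = mirror_descent_ascent(A, sequential=seq)
        print("sequential" if seq else "simultaneous", "value:", x @ A @ y)
```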
1 code implementation • 10 Jun 2020 • Mateusz B. Majka, Marc Sabate-Vidales, Łukasz Szpruch
In this paper, we construct a Multi-index Antithetic Stochastic Gradient Algorithm (MASGA) whose implementation is independent of the structure of the target measure and which achieves performance on par with Monte Carlo estimators that have access to unbiased samples from the distribution of interest.
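For background only (this is the generic multi-index Monte Carlo identity, not the paper's specific antithetic construction): given approximations $P_\alpha$ indexed by multi-indices $\alpha \in \mathbb{N}^d$ (for instance, a discretization level and a data-subsampling level), one writes

$$\mathbb{E}[P] \approx \sum_{\alpha \in \mathcal{I}} \mathbb{E}\big[\Delta P_\alpha\big], \qquad \Delta = \Delta_1 \circ \cdots \circ \Delta_d, \quad \Delta_i P_\alpha = P_\alpha - P_{\alpha - e_i},$$

(with $P_{\alpha - e_i} := 0$ when $\alpha_i = 0$) and estimates each mixed difference independently; an antithetic coupling is then used to make the variance of these differences decay rapidly across levels.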
no code implementations • 21 Aug 2018 • Mateusz B. Majka, Aleksandar Mijatović, Lukasz Szpruch
Finally, we provide a weak convergence analysis that covers both the standard and the randomised (inaccurate) drift case.
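To illustrate what the randomised (inaccurate) drift case refers to (the notation here is a generic sketch, not taken from the paper): instead of the exact drift $b$, each Euler step uses an unbiased estimator $\hat b(x, U)$,

$$X_{k+1} = X_k + h\, \hat b(X_k, U_k) + \sigma \sqrt{h}\, Z_k, \qquad \mathbb{E}\big[\hat b(x, U)\big] = b(x),$$

with $(U_k)$ i.i.d. auxiliary randomness (for example, data subsampling as in stochastic gradient schemes) and $(Z_k)$ i.i.d. standard Gaussians.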
no code implementations • 4 May 2016 • Michael B. Giles, Mateusz B. Majka, Lukasz Szpruch, Sebastian Vollmer, Konstantinos Zygalakis
We show that this is the first stochastic gradient MCMC method with complexity $\mathcal{O}(\varepsilon^{-2}|\log {\varepsilon}|^{3})$, in contrast to the complexity $\mathcal{O}(\varepsilon^{-3})$ of currently available methods.
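A standard heuristic for the $\mathcal{O}(\varepsilon^{-3})$ baseline (general reasoning, not a statement from the paper): reaching a bias of order $\varepsilon$ requires a step size $h \asymp \varepsilon$, i.e. $\mathcal{O}(\varepsilon^{-1})$ steps per trajectory, while controlling the Monte Carlo variance requires $\mathcal{O}(\varepsilon^{-2})$ samples, for a total cost of $\mathcal{O}(\varepsilon^{-3})$; the multilevel construction removes this multiplicative structure up to logarithmic factors.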