no code implementations • ICML 2020 • Robert Mattila, Cristian Rojas, Eric Moulines, Vikram Krishnamurthy, Bo Wahlberg
Can the parameters of a hidden Markov model (HMM) be estimated from a single sweep through the observations -- and additionally, without being trapped at a local optimum in the likelihood surface?
no code implementations • 16 May 2022 • Daniil Tiapkin, Denis Belomestny, Eric Moulines, Alexey Naumov, Sergey Samsonov, Yunhao Tang, Michal Valko, Pierre Menard
We propose the Bayes-UCBVI algorithm for reinforcement learning in tabular, stage-dependent, episodic Markov decision processes, a natural extension of the Bayes-UCB algorithm of Kaufmann et al. (2012) for multi-armed bandits.
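The underlying Bayes-UCB principle for bandits can be sketched as: pull the arm whose posterior quantile at level 1 - 1/t is largest. A minimal illustration, assuming unit-variance Gaussian rewards with a flat prior; the arm means, horizon, and simplified quantile schedule are illustrative choices, not those of the paper:

```python
import numpy as np
from statistics import NormalDist  # standard-library inverse normal CDF

rng = np.random.default_rng(8)

means = [0.0, 0.5, 1.0]                      # hypothetical true arm means

def bayes_ucb(T=5000):
    n = np.zeros(3)                          # pull counts per arm
    s = np.zeros(3)                          # cumulative rewards per arm
    pulls = np.zeros(3, dtype=int)
    for t in range(1, T + 1):
        if t <= 3:
            a = t - 1                        # initialise: pull each arm once
        else:
            z = NormalDist().inv_cdf(1.0 - 1.0 / t)
            # posterior is N(s/n, 1/n) under a flat prior; take its quantile
            a = int(np.argmax(s / n + z / np.sqrt(n)))
        r = means[a] + rng.standard_normal() # unit-variance Gaussian reward
        n[a] += 1.0
        s[a] += r
        pulls[a] += 1
    return pulls

pulls = bayes_ucb()
print(pulls)                                 # the best arm (index 2) dominates
```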
no code implementations • 10 Feb 2022 • Max Cohen, Guillaume Quispe, Sylvain Le Corff, Charles Ollion, Eric Moulines
In this work, we propose a new model to train the prior and the encoder/decoder networks simultaneously.
no code implementations • NeurIPS 2021 • Aymeric Dieuleveut, Gersende Fort, Eric Moulines, Geneviève Robin
The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models.
1 code implementation • NeurIPS 2021 • Achille Thin, Yazid Janati El Idrissi, Sylvain Le Corff, Charles Ollion, Eric Moulines, Arnaud Doucet, Alain Durmus, Christian Robert
Sampling from a complex distribution $\pi$ and approximating its intractable normalizing constant $\mathrm{Z}$ are challenging problems.
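A baseline that this line of work improves on is the importance sampling estimator of the normalizing constant, $\mathrm{Z} = \mathbb{E}_q[\tilde\pi(X)/q(X)]$ for any proposal density $q$. A minimal sketch on a toy Gaussian target; the target, proposal, and sample size are illustrative:

```python
import numpy as np

rng = np.random.default_rng(6)

def pi_tilde(x):
    # Unnormalised target exp(-x^2/2); its true normalizing constant is sqrt(2*pi).
    return np.exp(-0.5 * x ** 2)

def q_pdf(x, s=2.0):
    # Proposal density N(0, s^2), wider than the target so weights stay bounded.
    return np.exp(-0.5 * (x / s) ** 2) / (s * np.sqrt(2 * np.pi))

x = rng.normal(scale=2.0, size=200000)       # draw from the proposal
Z_hat = np.mean(pi_tilde(x) / q_pdf(x))      # average of importance weights
print(round(Z_hat, 2))
```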
no code implementations • 4 Nov 2021 • Evgeny Lagutin, Daniil Selikhanovych, Achille Thin, Sergey Samsonov, Alexey Naumov, Denis Belomestny, Maxim Panov, Eric Moulines
We develop an Explore-Exploit Markov chain Monte Carlo algorithm ($\operatorname{Ex^2MCMC}$) that combines multiple global proposals and local moves.
no code implementations • 3 Nov 2021 • Aymeric Dieuleveut, Gersende Fort, Eric Moulines, Geneviève Robin
The Expectation Maximization (EM) algorithm is the default algorithm for inference in latent variable models.
1 code implementation • 30 Jun 2021 • Achille Thin, Nikita Kotelevskii, Arnaud Doucet, Alain Durmus, Eric Moulines, Maxim Panov
Variational auto-encoders (VAE) are popular deep latent variable models which are trained by maximizing an Evidence Lower Bound (ELBO).
no code implementations • 11 Jun 2021 • Vincent Plassier, Maxime Vono, Alain Durmus, Eric Moulines
Performing reliable Bayesian inference on a big data scale is becoming a keystone in the modern era of machine learning.
no code implementations • NeurIPS 2021 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Kevin Scaman, Hoi-To Wai
This family of methods arises in many machine learning tasks and is used to obtain approximate solutions of a linear system $\bar{A}\theta = \bar{b}$ for which $\bar{A}$ and $\bar{b}$ can only be accessed through random estimates $\{({\bf A}_n, {\bf b}_n): n \in \mathbb{N}^*\}$.
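The basic linear stochastic approximation recursion behind such methods is $\theta_{n+1} = \theta_n - \alpha({\bf A}_n \theta_n - {\bf b}_n)$, often combined with Polyak-Ruppert averaging. A toy sketch, assuming a small synthetic system with Gaussian noise on the estimates:

```python
import numpy as np

rng = np.random.default_rng(0)

# Target linear system A_bar @ theta = b_bar, accessed only via noisy samples.
A_bar = np.array([[3.0, 1.0], [1.0, 2.0]])
b_bar = np.array([1.0, 1.0])
theta_star = np.linalg.solve(A_bar, b_bar)

def lsa(n_iter=20000, step=0.01):
    theta = np.zeros(2)
    avg = np.zeros(2)
    for n in range(1, n_iter + 1):
        A_n = A_bar + 0.1 * rng.standard_normal((2, 2))  # random estimate of A_bar
        b_n = b_bar + 0.1 * rng.standard_normal(2)       # random estimate of b_bar
        theta = theta - step * (A_n @ theta - b_n)       # LSA update
        avg += (theta - avg) / n                         # Polyak-Ruppert average
    return avg

estimate = lsa()
print(np.round(estimate, 2), np.round(theta_star, 2))
```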
no code implementations • 1 Jun 2021 • Maxime Vono, Vincent Plassier, Alain Durmus, Aymeric Dieuleveut, Eric Moulines
The objective of Federated Learning (FL) is to perform statistical inference for data which are decentralised and stored locally on networked clients.
no code implementations • 25 May 2021 • Gersende Fort, Eric Moulines
Incremental Expectation Maximization (EM) algorithms were introduced to adapt EM to the large-scale learning setting by avoiding a pass over the full data set at each iteration.
no code implementations • NeurIPS 2021 • Louis Leconte, Aymeric Dieuleveut, Edouard Oyallon, Eric Moulines, Gilles Pages
The growing size of models and datasets has made distributed implementations of stochastic gradient descent (SGD) an active field of research.
1 code implementation • 17 Mar 2021 • Achille Thin, Yazid Janati, Sylvain Le Corff, Charles Ollion, Arnaud Doucet, Alain Durmus, Eric Moulines, Christian Robert
Sampling from a complex distribution $\pi$ and approximating its intractable normalizing constant Z are challenging problems.
no code implementations • 30 Jan 2021 • Denis Belomestny, Eric Moulines, Alexey Naumov, Nikita Puchkin, Sergey Samsonov
In this work we undertake a thorough study of the non-asymptotic properties of vanilla generative adversarial networks (GANs).
no code implementations • 30 Jan 2021 • Alain Durmus, Eric Moulines, Alexey Naumov, Sergey Samsonov, Hoi-To Wai
This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain.
no code implementations • 1 Jan 2021 • Belhal Karimi, Hoi To Wai, Eric Moulines, Ping Li
Many constrained, nonconvex and nonsmooth optimization problems can be tackled using the majorization-minimization (MM) method which alternates between constructing a surrogate function which upper bounds the objective function, and then minimizing this surrogate.
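As a concrete instance of the MM idea: the absolute value admits a quadratic upper bound at the current iterate, and minimizing the resulting surrogate gives a reweighted-average update. A minimal sketch for 1-D least absolute deviations; the data and smoothing constant are illustrative:

```python
import numpy as np

# MM sketch for min_x sum_i |x - a_i| (the 1-D median problem).  Each |x - a_i|
# is majorized at the current iterate by a quadratic with matching value and
# slope; minimizing the surrogate is a weighted-average (IRLS) step.
a = np.array([1.0, 2.0, 3.0, 4.0, 10.0])    # minimizer is the median, 3
eps = 1e-6                                   # smoothing, avoids division by zero

x = 0.0
for _ in range(200):
    w = 1.0 / np.sqrt((x - a) ** 2 + eps)    # curvature of the quadratic majorizer
    x = np.sum(w * a) / np.sum(w)            # exact minimizer of the surrogate
print(round(x, 3))                           # → 3.0
```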
no code implementations • 31 Dec 2020 • Achille Thin, Nikita Kotelevskii, Christophe Andrieu, Alain Durmus, Eric Moulines, Maxim Panov
This paper fills the gap by developing general tools to ensure that a class of nonreversible Markov kernels, possibly relying on complex transforms, has the desired invariance property and leads to convergent algorithms.
no code implementations • NeurIPS 2020 • Gersende Fort, Eric Moulines, Hoi-To Wai
The Expectation Maximization (EM) algorithm is of key importance for inference in latent variable models, including mixtures of regressors and experts and models with missing observations.
no code implementations • 30 Nov 2020 • Gersende Fort, Eric Moulines, Hoi-To Wai
The Expectation Maximization (EM) algorithm is of key importance for inference in latent variable models, including mixtures of regressors and experts and models with missing observations.
no code implementations • 24 Nov 2020 • Gersende Fort, Eric Moulines, Hoi-To Wai
The Expectation Maximization (EM) algorithm is a key reference for inference in latent variable models; unfortunately, its computational cost is prohibitive in the large scale learning setting.
no code implementations • 27 Feb 2020 • Achille Thin, Nikita Kotelevskii, Jean-Stanislas Denain, Leo Grinsztajn, Alain Durmus, Maxim Panov, Eric Moulines
In this contribution, we propose a new computationally efficient method to combine Variational Inference (VI) with Markov Chain Monte Carlo (MCMC).
no code implementations • 4 Feb 2020 • Maxim Kaledin, Eric Moulines, Alexey Naumov, Vladislav Tadic, Hoi-To Wai
Our bounds show that there is no discrepancy in the convergence rate between Markovian and martingale noise, only the constants are affected by the mixing time of the Markov chain.
no code implementations • NeurIPS 2019 • Belhal Karimi, Hoi-To Wai, Eric Moulines, Marc Lavielle
To alleviate this problem, Neal and Hinton proposed an incremental version of EM (iEM) in which, at each iteration, the conditional expectation of the latent data (E-step) is updated only for a mini-batch of observations.
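The mini-batch E-step idea can be sketched on a toy two-component Gaussian mixture; unit variances and equal weights are simplifying assumptions, and a real implementation would update the aggregated statistics incrementally rather than re-summing:

```python
import numpy as np

rng = np.random.default_rng(5)

# iEM sketch: per-observation E-step statistics are stored, and each iteration
# refreshes them only for one mini-batch before re-running the M-step.
true_means = np.array([-2.0, 3.0])
N, B = 2000, 50                              # data set size, mini-batch size
data = true_means[rng.integers(2, size=N)] + rng.standard_normal(N)

def responsibilities(y, mu):
    # Posterior component probabilities under unit variances, equal weights.
    logw = -0.5 * (y[:, None] - mu[None, :]) ** 2
    w = np.exp(logw - logw.max(axis=1, keepdims=True))
    return w / w.sum(axis=1, keepdims=True)

def incremental_em(n_epochs=20):
    mu = np.array([-1.0, 1.0])
    W = responsibilities(data, mu)           # stored per-observation statistics
    for _ in range(n_epochs):
        for start in range(0, N, B):
            idx = slice(start, start + B)
            W[idx] = responsibilities(data[idx], mu)  # E-step on one mini-batch
            s1 = W.sum(axis=0)                        # aggregated statistics
            s2 = (W * data[:, None]).sum(axis=0)
            mu = s2 / s1                              # M-step
    return np.sort(mu)

mu = incremental_em()
print(np.round(mu, 1))
```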
no code implementations • 2 Feb 2019 • Belhal Karimi, Blazej Miasojedow, Eric Moulines, Hoi-To Wai
We illustrate these settings with the online EM algorithm and the policy-gradient method for average reward maximization in reinforcement learning.
no code implementations • NeurIPS 2018 • Nicolas Brosse, Alain Durmus, Eric Moulines
As $N$ becomes large, we show that the SGLD algorithm has an invariant probability measure which significantly departs from the target posterior and behaves like Stochastic Gradient Descent (SGD).
no code implementations • 5 Dec 2016 • Hoi-To Wai, Jean Lafond, Anna Scaglione, Eric Moulines
The convergence of the proposed algorithm is studied by viewing the decentralized algorithm as an inexact FW algorithm.
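For reference, the basic (exact) Frank-Wolfe iteration replaces projection with a linear minimization over the constraint set; on the probability simplex that linear step simply selects a vertex. A toy sketch on a quadratic objective; the objective and iteration count are illustrative:

```python
import numpy as np

# Minimize f(x) = ||x - c||^2 over the probability simplex with Frank-Wolfe.
c = np.array([0.2, 0.3, 0.5])              # inside the simplex, so optimum is c

def frank_wolfe(n_iter=5000):
    x = np.array([1.0, 0.0, 0.0])          # start at a vertex
    for k in range(n_iter):
        grad = 2.0 * (x - c)               # gradient of f at x
        i = int(np.argmin(grad))           # linear minimization over the simplex
        s = np.zeros(3)
        s[i] = 1.0                         # ... returns a vertex e_i
        gamma = 2.0 / (k + 2.0)            # classic step-size schedule
        x = (1.0 - gamma) * x + gamma * s  # convex combination stays feasible
    return x

x = frank_wolfe()
print(np.round(x, 2))
```

The iterate stays feasible by construction, which is why FW suits decentralized settings where projections are expensive.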
no code implementations • NeurIPS 2016 • Alain Durmus, Umut Simsekli, Eric Moulines, Roland Badeau, Gaël Richard
We illustrate our framework on the popular Stochastic Gradient Langevin Dynamics (SGLD) algorithm and propose a novel SG-MCMC algorithm referred to as Stochastic Gradient Richardson-Romberg Langevin Dynamics (SGRRLD).
no code implementations • 5 May 2016 • Alain Durmus, Eric Moulines
We consider in this paper the problem of sampling a high-dimensional probability distribution $\pi$ having a density with respect to the Lebesgue measure on $\mathbb{R}^d$, known up to a normalization constant $x \mapsto \pi(x)= \mathrm{e}^{-U(x)}/\int_{\mathbb{R}^d} \mathrm{e}^{-U(y)} \mathrm{d} y$.
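A standard discretization of the Langevin diffusion for such targets is the unadjusted Langevin algorithm, $x_{k+1} = x_k - \gamma \nabla U(x_k) + \sqrt{2\gamma}\,\xi_{k+1}$. A minimal sketch on a one-dimensional standard Gaussian target; the step size and sample counts are illustrative:

```python
import numpy as np

rng = np.random.default_rng(1)

# ULA targeting pi(x) ∝ exp(-U(x)) with U(x) = x^2 / 2 (standard Gaussian).
def grad_U(x):
    return x

def ula(n_samples=200000, step=0.01, burn_in=1000):
    x = 0.0
    samples = []
    for k in range(n_samples + burn_in):
        x = x - step * grad_U(x) + np.sqrt(2 * step) * rng.standard_normal()
        if k >= burn_in:
            samples.append(x)
    return np.array(samples)

samples = ula()
print(round(samples.mean(), 2), round(samples.std(), 2))
```

ULA is biased for any fixed step size (there is no accept/reject correction), which is precisely the kind of discretization error this line of work quantifies.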
no code implementations • 5 Oct 2015 • Jean Lafond, Hoi-To Wai, Eric Moulines
With a strongly convex stochastic cost and when the optimal solution lies in the interior of the constraint set or the constraint set is a polytope, the regret bound and anytime optimality are shown to be ${\cal O}( \log^3 T / T )$ and ${\cal O}( \log^2 T / T)$, respectively, where $T$ is the number of rounds played.
no code implementations • NeurIPS 2014 • Jean Lafond, Olga Klopp, Eric Moulines, Joseph Salmon
The task of reconstructing a matrix given a sample of observed entries is known as the matrix completion problem.
no code implementations • 26 Aug 2014 • Olga Klopp, Jean Lafond, Eric Moulines, Joseph Salmon
The task of estimating a matrix given a sample of observed entries is known as the \emph{matrix completion problem}.
no code implementations • NeurIPS 2013 • Francis Bach, Eric Moulines
We consider the stochastic approximation problem where a convex function has to be minimized, given only the knowledge of unbiased estimates of its gradients at certain points, a framework which includes machine learning methods based on the minimization of the empirical risk.
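The averaged-SGD scheme studied in this line of work can be sketched on streaming least squares: run constant-step SGD on fresh samples and return the running average of the iterates. A toy illustration; the model, step size, and noise level are arbitrary choices:

```python
import numpy as np

rng = np.random.default_rng(2)

# Averaged SGD for least-squares regression from unbiased gradient estimates:
# each step uses a single fresh sample (a, b) with b = <a, w*> + noise.
w_star = np.array([1.0, -2.0])

def averaged_sgd(n=50000, step=0.05):
    w = np.zeros(2)
    w_bar = np.zeros(2)
    for k in range(1, n + 1):
        a = rng.standard_normal(2)
        b = a @ w_star + 0.1 * rng.standard_normal()
        grad = (a @ w - b) * a            # unbiased gradient of the risk at w
        w = w - step * grad
        w_bar += (w - w_bar) / k          # Polyak-Ruppert averaging
    return w_bar

w_bar = averaged_sgd()
print(np.round(w_bar, 2))
```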
2 code implementations • 4 May 2012 • Blazej Miasojedow, Eric Moulines, Matti Vihola
Parallel tempering is a generic Markov chain Monte Carlo sampling method which allows good mixing with multimodal target distributions, where conventional Metropolis-Hastings algorithms often fail.
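A minimal parallel tempering sketch: run a random-walk Metropolis chain at each temperature and occasionally propose exchanging the states of adjacent temperatures with the usual swap acceptance ratio. The bimodal target and temperature ladder below are illustrative:

```python
import numpy as np

rng = np.random.default_rng(3)

def log_pi(x):
    # Bimodal target: mixture of unit-variance Gaussians centred at -4 and +4.
    return np.logaddexp(-0.5 * (x - 4.0) ** 2, -0.5 * (x + 4.0) ** 2)

temps = [1.0, 3.0, 10.0, 30.0]               # T = 1 is the chain of interest

def parallel_tempering(n_iter=20000):
    xs = np.zeros(len(temps))
    cold = []
    for _ in range(n_iter):
        # Random-walk Metropolis move within each tempered chain.
        for i, T in enumerate(temps):
            prop = xs[i] + rng.normal(scale=np.sqrt(T))
            if np.log(rng.uniform()) < (log_pi(prop) - log_pi(xs[i])) / T:
                xs[i] = prop
        # Propose swapping the states of a random adjacent pair.
        i = rng.integers(len(temps) - 1)
        a = (1.0 / temps[i] - 1.0 / temps[i + 1]) * (log_pi(xs[i + 1]) - log_pi(xs[i]))
        if np.log(rng.uniform()) < a:
            xs[i], xs[i + 1] = xs[i + 1], xs[i]
        cold.append(xs[0])
    return np.array(cold)

cold = parallel_tempering()
print((cold > 0).mean())                     # cold chain visits both modes
```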
no code implementations • NeurIPS 2011 • Eric Moulines, Francis R. Bach
We consider the minimization of a convex objective function defined on a Hilbert space, which is only available through unbiased estimates of its gradients.
no code implementations • NeurIPS 2008 • Zaïd Harchaoui, Eric Moulines, Francis R. Bach
Change-point analysis of an (unlabelled) sample of observations consists in, first, testing whether a change in distribution occurs within the sample and, second, if a change occurs, estimating the change-point instant after which the observations switch from one distribution to another.
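A toy version of this two-step task for a shift in mean: scan candidate split points and maximise the standardised difference of the two segment means (the paper's kernel-based statistic is replaced here by this simple mean-difference statistic):

```python
import numpy as np

rng = np.random.default_rng(7)

# Synthetic sample with a mean shift of 1 at index 200.
n = 400
x = np.concatenate([rng.normal(0.0, 1.0, 200), rng.normal(1.0, 1.0, 200)])

def change_point(x):
    n = len(x)
    best_t, best_stat = None, -np.inf
    for t in range(20, n - 20):                        # skip tiny segments
        m1, m2 = x[:t].mean(), x[t:].mean()
        stat = abs(m1 - m2) * np.sqrt(t * (n - t) / n)  # standardised difference
        if stat > best_stat:
            best_t, best_stat = t, stat
    return best_t, best_stat

t_hat, stat = change_point(x)
print(t_hat, round(stat, 2))
```

A large maximal statistic signals that a change occurred; its location estimates the change-point instant.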
no code implementations • 27 Dec 2007 • Olivier Cappé, Eric Moulines
The resulting algorithm is usually simpler and is shown to achieve convergence to the stationary points of the Kullback-Leibler divergence between the marginal distribution of the observations and the model distribution at the optimal rate, i.e., that of the maximum likelihood estimator.
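The online EM recursion replaces the batch E-step by a stochastic-approximation update of the sufficient statistics, $\hat{s}_t = \hat{s}_{t-1} + \gamma_t (s(y_t) - \hat{s}_{t-1})$, followed by the usual M-step. A sketch for a two-component Gaussian mixture with known unit variances and equal weights; the step-size schedule and data are illustrative:

```python
import numpy as np

rng = np.random.default_rng(4)

true_means = np.array([-2.0, 3.0])           # only the means are estimated

def sample():
    j = rng.integers(2)
    return true_means[j] + rng.standard_normal()

def online_em(n=20000):
    mu = np.array([-1.0, 1.0])               # initial guess for the means
    s1 = np.full(2, 0.5)                     # running E[responsibility_j]
    s2 = s1 * mu                             # running E[responsibility_j * y]
    for t in range(1, n + 1):
        y = sample()
        # E-step for the new observation: posterior responsibilities.
        logw = -0.5 * (y - mu) ** 2
        w = np.exp(logw - logw.max())
        w /= w.sum()
        # Stochastic-approximation update of the sufficient statistics.
        gamma = 1.0 / (t ** 0.6 + 10.0)
        s1 += gamma * (w - s1)
        s2 += gamma * (w * y - s2)
        # M-step: the means are ratios of the averaged statistics.
        mu = s2 / s1
    return np.sort(mu)

mu = online_em()
print(np.round(mu, 1))
```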