no code implementations • 17 Apr 2023 • Irene Balelli, Aude Sportisse, Francesco Cremonesi, Pierre-Alexandre Mattei, Marco Lorenzi
In addition, thanks to the variational nature of Fed-MIWAE, our method is designed to perform multiple imputation, allowing for the quantification of the imputation uncertainty in the federated scenario.
no code implementations • 15 Feb 2023 • Aude Sportisse, Hugo Schmutz, Olivier Humbert, Charles Bouveyron, Pierre-Alexandre Mattei
Semi-supervised learning is a powerful technique for leveraging unlabeled data to improve machine learning models, but it can be affected by the presence of "informative" labels, which occur when some classes are more likely to be labeled than others.
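A short sketch of what "informative" labels mean in practice: the probability that a point is labeled depends on its class, so the labeled subset is not representative of the full population. The numbers below are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Two-class problem where class 1 is much less likely to be labeled.
y = rng.integers(0, 2, size=10_000)
p_labeled = np.where(y == 0, 0.8, 0.2)  # labeling probability depends on y
labeled = rng.random(y.size) < p_labeled

# The class balance among labeled points differs from the population:
pop_rate = y.mean()               # close to 0.5 in the full data
labeled_rate = y[labeled].mean()  # far below 0.5 among labeled points
```

A model trained naively on the labeled subset would see a distorted class distribution, which is the failure mode that methods handling informative labels aim to correct.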
1 code implementation • 20 Dec 2021 • Aude Sportisse, Matthieu Marbac, Fabien Laporte, Gilles Celeux, Claire Boyer, Julie Josse, Christophe Biernacki
In this paper, we propose model-based clustering algorithms designed to handle very general types of missing data, including missing-not-at-random (MNAR) data.
no code implementations • NeurIPS 2020 • Aude Sportisse, Claire Boyer, Aymeric Dieuleveut, Julie Josse
The stochastic gradient algorithm is a key ingredient of many machine learning methods and is particularly appropriate for large-scale learning.
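As a reference point, here is a minimal averaged stochastic gradient descent for least squares on complete data (the generic algorithm only, not the paper's debiased variant for missing values):

```python
import numpy as np

rng = np.random.default_rng(2)

# Synthetic least-squares problem: y = X @ beta_star + noise.
n, d = 5000, 5
beta_star = np.arange(1.0, d + 1.0)
X = rng.normal(size=(n, d))
y = X @ beta_star + 0.1 * rng.normal(size=n)

# Averaged SGD: one pass over the data, updating on a single sample at a
# time and keeping a running (Polyak-Ruppert) average of the iterates.
beta = np.zeros(d)
beta_avg = np.zeros(d)
step = 0.05
for t in range(n):
    i = t  # single pass, in data order
    grad = (X[i] @ beta - y[i]) * X[i]  # gradient of 0.5 * (x_i @ beta - y_i)**2
    beta -= step * grad
    beta_avg += (beta - beta_avg) / (t + 1)
```

With missing entries in `X`, the naive stochastic gradient becomes biased; correcting that bias is the contribution of the paper.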
1 code implementation • 12 May 2020 • Pascaline Descloux, Claire Boyer, Julie Josse, Aude Sportisse, Sylvain Sardy
The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates.
1 code implementation • NeurIPS 2020 • Aude Sportisse, Claire Boyer, Julie Josse
Considering a data matrix generated from a probabilistic principal component analysis (PPCA) model containing several MNAR variables, not necessarily under the same self-masked missing mechanism, we propose estimators for the means, variances and covariances of the variables and study their consistency.
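A small simulation shows why self-masked MNAR data breaks naive estimators (an illustration of the problem the paper addresses, not of its proposed estimators): when larger values are more likely to go missing, the mean of the observed values is systematically biased.

```python
import numpy as np

rng = np.random.default_rng(3)

# A variable with true mean 0; larger values are more likely to be missing
# ("self-masked" MNAR: missingness depends on the value itself).
n = 100_000
x = rng.normal(size=n)
p_missing = 1.0 / (1.0 + np.exp(-2.0 * x))  # logistic self-masking
mask = rng.random(n) < p_missing
x_obs = x[~mask]

true_mean = x.mean()        # close to 0
naive_mean = x_obs.mean()   # strongly negative: large values were dropped
```

Consistent estimators of the means, variances, and covariances must therefore account for the missing mechanism rather than average over observed entries only.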
Statistics Theory
1 code implementation • 29 Dec 2018 • Aude Sportisse, Claire Boyer, Julie Josse
Our second contribution is a computationally efficient surrogate estimation that implicitly takes into account the joint distribution of the data and the missing mechanism: the data matrix is concatenated with the mask coding for the missing values, and a low-rank structure for the exponential family is assumed on this new matrix in order to encode links between variables and missing mechanisms.
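The concatenation idea can be sketched as follows (a plain truncated SVD on a Gaussian toy matrix, standing in for the exponential-family low-rank estimation of the paper): appending the binary mask to the data matrix and factorizing the joint matrix lets variables and missingness patterns share low-rank structure.

```python
import numpy as np

rng = np.random.default_rng(4)

# Low-rank data with missing entries.
n, d, r = 200, 10, 2
X = rng.normal(size=(n, r)) @ rng.normal(size=(r, d))
mask = rng.random((n, d)) < 0.3          # 1 where the entry is missing
X_incomplete = np.where(mask, 0.0, X)    # zero-fill, for the sketch only

# Concatenate data and mask, then take a rank-k truncated SVD of the
# joint n x 2d matrix.
Y = np.concatenate([X_incomplete, mask.astype(float)], axis=1)
k = 4
U, s, Vt = np.linalg.svd(Y, full_matrices=False)
Y_lowrank = (U[:, :k] * s[:k]) @ Vt[:k]

X_hat = Y_lowrank[:, :d]   # reconstructed data block
M_hat = Y_lowrank[:, d:]   # reconstructed missingness block
```

Because the two blocks are factorized jointly, the recovered factors couple the variables with the missing mechanism, which is the intuition behind the surrogate estimation.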