no code implementations • 12 Feb 2024 • Nathan Doumèche, Francis Bach, Claire Boyer, Gérard Biau
In this context, we consider a general regression problem in which the empirical risk is regularized by a penalty derived from a partial differential equation, quantifying the physical inconsistency of the estimator.
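The idea of penalizing an empirical risk with a physical-inconsistency term can be sketched in a few lines. The example below is a minimal illustration, not the paper's actual setting: it assumes a 1-D problem, a polynomial model, and the Laplace equation u'' = 0 as the physical prior, with the PDE residual evaluated on a collocation grid; all function names are ours.

```python
import numpy as np

def pde_regularized_fit(x, y, lam=1.0, degree=5, n_colloc=50):
    """Least-squares polynomial fit with a PDE penalty (here u'' = 0,
    the 1-D Laplace equation) evaluated on a collocation grid.
    The penalty term quantifies the physical inconsistency of the fit."""
    # Polynomial design matrix for the data: columns x^0, ..., x^degree.
    A = np.vander(x, degree + 1, increasing=True)
    # Collocation grid and the matrix of second derivatives:
    # d^2/dx^2 (x^k) = k (k - 1) x^(k - 2).
    xc = np.linspace(x.min(), x.max(), n_colloc)
    D2 = np.zeros((n_colloc, degree + 1))
    for j in range(2, degree + 1):
        D2[:, j] = j * (j - 1) * xc ** (j - 2)
    # Normal equations of  ||A w - y||^2 / n  +  lam ||D2 w||^2 / m.
    n, m = len(x), n_colloc
    H = A.T @ A / n + lam * D2.T @ D2 / m
    return np.linalg.solve(H, A.T @ y / n)
```

With data generated by a function that already satisfies the PDE (here, a linear function), both the data term and the penalty can vanish simultaneously, so the regularized fit recovers it.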
no code implementations • 7 Feb 2024 • Stanislas Strasman, Antonio Ocello, Claire Boyer, Sylvain Le Corff, Vincent Lemaire
Under mild assumptions on the data distribution, we establish an upper bound on the KL divergence between the target and the estimated distributions, valid for any time-dependent noise schedule and depending explicitly on it.
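To make the role of the noise schedule concrete, here is a small illustrative computation (our own sketch, not the paper's bound): for a 1-D Gaussian target under the standard variance-preserving forward kernel, the KL divergence between the noised data distribution and the standard-normal prior can be written in closed form and shrinks as the schedule drives the signal coefficient to zero.

```python
import numpy as np

def kl_gauss(m1, v1, m2, v2):
    """KL( N(m1, v1) || N(m2, v2) ) for univariate Gaussians."""
    return 0.5 * (np.log(v2 / v1) + (v1 + (m1 - m2) ** 2) / v2 - 1.0)

def forward_kl_to_prior(mu0, var0, alpha_bar):
    """For a Gaussian target N(mu0, var0), the variance-preserving forward
    process gives x_t ~ N(sqrt(ab) * mu0, ab * var0 + 1 - ab), where ab is
    the cumulative signal coefficient of the noise schedule at time t.
    Return the KL of this law to the standard-normal prior."""
    m = np.sqrt(alpha_bar) * mu0
    v = alpha_bar * var0 + (1.0 - alpha_bar)
    return kl_gauss(m, v, 0.0, 1.0)
```

As the schedule sends alpha_bar toward 0 at the terminal time, the KL to the prior vanishes, which is the mechanism a schedule-dependent bound has to track.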
no code implementations • 6 Feb 2024 • Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet
Constant (naive) imputation is still widely used in practice, as it is an easy-to-use first technique for dealing with missing data.
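As an illustration of how simple the technique is, constant imputation fits in a few lines (a generic numpy sketch, not code from the paper); a user-supplied constant or the observed column mean are the usual choices:

```python
import numpy as np

def constant_impute(X, value=None):
    """Replace NaNs column by column with a constant: either a
    user-supplied value, or the mean of the observed entries."""
    X = np.asarray(X, dtype=float).copy()
    for j in range(X.shape[1]):
        missing = np.isnan(X[:, j])
        if missing.any():
            fill = value if value is not None else np.nanmean(X[:, j])
            X[missing, j] = fill
    return X
```

The input is copied, so the original matrix and its missingness pattern are left untouched.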
no code implementations • 30 Sep 2022 • Patrick Lutz, Ludovic Arnould, Claire Boyer, Erwan Scornet
Dedicated neural network (NN) architectures have been designed to handle specific data types (such as CNNs for images or RNNs for text), which makes them state-of-the-art methods for these data.
no code implementations • 3 Feb 2022 • Alexis Ayme, Claire Boyer, Aymeric Dieuleveut, Erwan Scornet
Missing values arise in most real-world data sets due to the aggregation of multiple sources and to intrinsically missing information (e.g., sensor failures or unanswered survey questions).
1 code implementation • 20 Dec 2021 • Aude Sportisse, Matthieu Marbac, Fabien Laporte, Gilles Celeux, Claire Boyer, Julie Josse, Christophe Biernacki
In this paper, we propose model-based clustering algorithms designed to handle very general types of missing data, including MNAR data.
no code implementations • NeurIPS 2020 • Aude Sportisse, Claire Boyer, Aymeric Dieuleveut, Julie Josse
The stochastic gradient algorithm is a key ingredient of many machine learning methods and is particularly appropriate for large-scale learning.
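For reference, the plain stochastic gradient algorithm on a least-squares risk looks as follows (a textbook sketch processing one sample per step, not the paper's variant for missing values):

```python
import numpy as np

def sgd_least_squares(X, y, lr=0.05, epochs=200, seed=0):
    """Plain stochastic gradient descent on the least-squares risk,
    taking one gradient step per sample in shuffled order."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(epochs):
        for i in rng.permutation(len(y)):
            # Gradient of 0.5 * (x_i . w - y_i)^2 with respect to w.
            grad = (X[i] @ w - y[i]) * X[i]
            w -= lr * grad
    return w
```

Each step touches a single sample, which is what makes the method attractive at large scale: the per-iteration cost is independent of the number of observations.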
no code implementations • 29 Oct 2020 • Ludovic Arnould, Claire Boyer, Erwan Scornet
Random forests on the one hand, and neural networks on the other, have met with great success in the machine learning community thanks to their predictive performance.
1 code implementation • 12 May 2020 • Pascaline Descloux, Claire Boyer, Julie Josse, Aude Sportisse, Sylvain Sardy
The use of Robust Lasso-Zero is showcased for variable selection with missing values in the covariates.
1 code implementation • ICML 2020 • Boris Muzellec, Julie Josse, Claire Boyer, Marco Cuturi
Missing data is a crucial issue when applying machine learning algorithms to real-world datasets.
1 code implementation • NeurIPS 2020 • Aude Sportisse, Claire Boyer, Julie Josse
Considering a data matrix generated from a probabilistic principal component analysis (PPCA) model containing several MNAR variables, not necessarily under the same self-masked missing mechanism, we propose estimators for the means, variances and covariances of the variables and study their consistency.
Statistics Theory
1 code implementation • 29 Dec 2018 • Aude Sportisse, Claire Boyer, Julie Josse
Our second contribution is a computationally efficient surrogate estimation that implicitly takes into account the joint distribution of the data and the missing mechanism: the data matrix is concatenated with the mask coding for the missing values, and a low-rank exponential-family structure is assumed on this augmented matrix in order to encode the links between variables and missing mechanisms.
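The concatenation idea can be sketched as follows. This is only an illustration of the construction: a plain truncated SVD stands in for the exponential-family low-rank model of the paper, and the zero imputation and function name are our own choices.

```python
import numpy as np

def lowrank_data_mask(X, rank=2):
    """Concatenate the (zero-imputed) data matrix with the binary mask
    coding for missingness, then project onto a rank-r approximation.
    The shared low-rank factors link variables and missing mechanisms."""
    mask = np.isnan(X).astype(float)      # 1 where the entry is missing
    X0 = np.where(np.isnan(X), 0.0, X)    # naive zero imputation
    Z = np.hstack([X0, mask])             # augmented matrix [data | mask]
    U, s, Vt = np.linalg.svd(Z, full_matrices=False)
    Zr = U[:, :rank] * s[:rank] @ Vt[:rank]
    # Split the low-rank fit back into a data part and a mask part.
    return Zr[:, : X.shape[1]], Zr[:, X.shape[1] :]
```

Because data and mask columns share the same low-rank factors, the fit couples the observed values with the missingness pattern, which is the point of the construction.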
no code implementations • 29 Aug 2018 • Erwan Fouillen, Claire Boyer, Maxime Sangnier
Gradient boosting is a prediction method that iteratively combines weak learners to produce a complex and accurate model.
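The iterative scheme can be made concrete with the simplest instance: boosting depth-1 regression stumps under squared loss, where each round fits a stump to the current residuals and adds it with a shrinkage factor. This is a generic textbook sketch (all names are ours), not the paper's method.

```python
import numpy as np

def fit_stump(X, r):
    """Best depth-1 split of the residuals r under squared error."""
    best_sse, best = np.inf, None
    for j in range(X.shape[1]):
        for t in np.unique(X[:, j])[:-1]:      # keep both sides non-empty
            left = X[:, j] <= t
            lmean, rmean = r[left].mean(), r[~left].mean()
            sse = ((r[left] - lmean) ** 2).sum() + ((r[~left] - rmean) ** 2).sum()
            if sse < best_sse:
                best_sse, best = sse, (j, t, lmean, rmean)
    return best

def boost(X, y, n_rounds=100, lr=0.1):
    """Gradient boosting for squared loss: each weak learner is fitted
    to the residuals of the current ensemble, then added with shrinkage lr."""
    pred = np.full(len(y), y.mean())
    model = []
    for _ in range(n_rounds):
        j, t, lv, rv = fit_stump(X, y - pred)
        pred += lr * np.where(X[:, j] <= t, lv, rv)
        model.append((j, t, lv, rv))
    return pred, model
```

On a target that a single split can represent, the training error contracts geometrically with the number of rounds, which illustrates how weak learners accumulate into an accurate model.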
1 code implementation • 11 Jun 2018 • Ben Adcock, Claire Boyer, Simone Brugiapaglia
We present improved sampling complexity bounds for stable and robust sparse recovery in compressed sensing.
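As background for what "stable and robust sparse recovery" computes in practice, here is a standard solver, iterative soft-thresholding (ISTA) for the Lasso; this is a generic sketch unrelated to the paper's sampling-complexity analysis, and the function name is ours.

```python
import numpy as np

def ista(A, y, lam=0.1, n_iter=1000):
    """Iterative soft-thresholding for the Lasso
        min_x  0.5 * ||A x - y||^2  +  lam * ||x||_1,
    a standard convex program for stable sparse recovery."""
    L = np.linalg.norm(A, 2) ** 2          # Lipschitz constant of the gradient
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        g = x - (A.T @ (A @ x - y)) / L    # gradient step on the smooth part
        x = np.sign(g) * np.maximum(np.abs(g) - lam / L, 0.0)  # soft-threshold
    return x
```

With a random Gaussian measurement matrix well below the ambient dimension, a few-sparse vector is recovered accurately, which is the regime the sampling-complexity bounds describe.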
Information Theory