Search Results for author: Stéphane Gaïffas

Found 27 papers, 8 papers with code

Robust Stochastic Optimization via Gradient Quantile Clipping

no code implementations 29 Sep 2023 Ibrahim Merad, Stéphane Gaïffas

For strongly convex objectives, we prove that the iteration converges to a concentrated distribution and derive high probability bounds on the final estimation error.

Stochastic Optimization

Convergence and concentration properties of constant step-size SGD through Markov chains

no code implementations 20 Jun 2023 Ibrahim Merad, Stéphane Gaïffas

We consider the optimization of a smooth and strongly convex objective using constant step-size stochastic gradient descent (SGD) and study its properties through the prism of Markov chains.

Robust Methods for High-Dimensional Linear Learning

no code implementations 10 Aug 2022 Ibrahim Merad, Stéphane Gaïffas

We propose statistically robust and computationally efficient linear learning methods in the high-dimensional batch setting, where the number of features $d$ may exceed the sample size $n$.

Vocal Bursts Intensity Prediction

Robust supervised learning with coordinate gradient descent

1 code implementation 31 Jan 2022 Stéphane Gaïffas, Ibrahim Merad

This paper considers the problem of supervised learning with linear methods when both features and labels can be corrupted, in the form of heavy-tailed data and/or corrupted rows.

WildWood: a new Random Forest algorithm

1 code implementation 16 Sep 2021 Stéphane Gaïffas, Ibrahim Merad, Yiyang Yu

We introduce WildWood (WW), a new ensemble algorithm for supervised learning of Random Forest (RF) type.

An improper estimator with optimal excess risk in misspecified density estimation and logistic regression

no code implementations 23 Dec 2019 Jaouad Mourtada, Stéphane Gaïffas

On standard examples, this bound scales as $d/n$ with $d$ the model dimension and $n$ the sample size, and critically remains valid under model misspecification.

Density Estimation regression +1

ZiMM: a deep learning model for long term and blurry relapses with non-clinical claims data

no code implementations 13 Nov 2019 Anastasiia Kabeshova, Yiyang Yu, Bertrand Lukacs, Emmanuel Bacry, Stéphane Gaïffas

We consider a long-term (18 months) relapse (urination problems still occur despite surgery), which is blurry since it is observed only through the reimbursement of a specific set of drugs for urination problems.

SCALPEL3: a scalable open-source library for healthcare claims databases

3 code implementations 15 Oct 2019 Emmanuel Bacry, Stéphane Gaïffas, Fanny Leroy, Maryan Morel, Dinh Phong Nguyen, Youcef Sebiat, Dian Sun

SCALPEL-Extraction provides fast concept extraction from a big table such as the one produced by SCALPEL-Flattening.

Distributed, Parallel, and Cluster Computing Computers and Society

AMF: Aggregated Mondrian Forests for Online Learning

2 code implementations 25 Jun 2019 Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

Using a variant of the Context Tree Weighting algorithm, we show that it is possible to efficiently perform an exact aggregation over all prunings of the trees; in particular, this yields a truly online parameter-free algorithm which is competitive with the optimal pruning of the Mondrian tree, and thus adaptive to the unknown regularity of the regression function.

General Classification Multi-class Classification +1

On the optimality of the Hedge algorithm in the stochastic regime

no code implementations 5 Sep 2018 Jaouad Mourtada, Stéphane Gaïffas

Moreover, our analysis exhibits qualitative differences with other variants of the Hedge algorithm, such as the fixed-horizon version (with constant learning rate) and the one based on the so-called "doubling trick", both of which fail to adapt to the easier stochastic setting.

Dual optimization for convex constrained objectives without the gradient-Lipschitz assumption

no code implementations 10 Jul 2018 Martin Bompaire, Emmanuel Bacry, Stéphane Gaïffas

The minimization of convex objectives coming from linear supervised learning problems, such as penalized generalized linear models, can be formulated as finite sums of convex functions.

Computational Efficiency regression

Minimax optimal rates for Mondrian trees and forests

no code implementations 15 Mar 2018 Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

Our results include consistency and convergence rates for Mondrian Trees and Forests, which turn out to be minimax optimal on the set of $s$-Hölder functions with $s \in (0, 1]$ (for trees and forests) and $s \in (1, 2]$ (for forests only), assuming a proper tuning of their complexity parameter in both cases.

ConvSCCS: convolutional self-controlled case series model for lagged adverse event detection

1 code implementation 21 Dec 2017 Maryan Morel, Emmanuel Bacry, Stéphane Gaïffas, Agathe Guilloux, Fanny Leroy

With the increased availability of large databases of electronic health records (EHRs) comes the chance of enhancing health risks screening.

Event Detection Marketing

High-dimensional robust regression and outliers detection with SLOPE

no code implementations 7 Dec 2017 Alain Virouleau, Agathe Guilloux, Stéphane Gaïffas, Malgorzata Bogdan

Following a recent set of works providing methods for simultaneous robust regression and outliers detection, we consider in this paper a model of linear regression with individual intercepts, in a high-dimensional setting.

regression Vocal Bursts Intensity Prediction

Universal consistency and minimax rates for online Mondrian Forests

no code implementations NeurIPS 2017 Jaouad Mourtada, Stéphane Gaïffas, Erwan Scornet

We establish the consistency of an algorithm of Mondrian Forests, a randomized classification algorithm that can be implemented online.

General Classification regression

Sparse inference of the drift of a high-dimensional Ornstein-Uhlenbeck process

no code implementations 10 Jul 2017 Stéphane Gaïffas, Gustaw Matulewicz

As a by-product, we point out the fact that for the Ornstein-Uhlenbeck process, one does not need an assumption of restricted eigenvalue type in order to derive fast rates for the Lasso, while it is well-known to be mandatory for linear regression for instance.

Variable Selection

Tick: a Python library for statistical learning, with a particular emphasis on time-dependent modelling

2 code implementations 10 Jul 2017 Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Soren Poulsen

Tick is a statistical learning library for Python 3, with a particular emphasis on time-dependent models, such as point processes, and tools for generalized linear models and survival analysis.

Point Processes Survival Analysis

Binarsity: a penalization for one-hot encoded features in linear supervised learning

no code implementations 24 Mar 2017 Mokhtar Z. Alaya, Simon Bussy, Stéphane Gaïffas, Agathe Guilloux

In each group of binary features coming from the one-hot encoding of a single raw continuous feature, this penalization uses total-variation regularization together with an extra linear constraint.

C-mix: a high dimensional mixture model for censored durations, with applications to genetic data

1 code implementation 24 Oct 2016 Simon Bussy, Agathe Guilloux, Stéphane Gaïffas, Anne-Sophie Jannot

We introduce a mixture model for censored durations (C-mix), and develop maximum likelihood inference for the joint estimation of the time distributions and latent regression parameters of the model.

Uncovering Causality from Multivariate Hawkes Integrated Cumulants

1 code implementation ICML 2017 Massil Achab, Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-François Muzy

We design a new nonparametric method that allows one to estimate the matrix of integrated kernels of a multivariate Hawkes process.

Mean-field inference of Hawkes point processes

no code implementations 4 Nov 2015 Emmanuel Bacry, Stéphane Gaïffas, Iacopo Mastromatteo, Jean-François Muzy

We propose a fast and efficient estimation method that is able to accurately recover the parameters of a d-dimensional Hawkes point-process from a set of observations.

Point Processes valid

SGD with Variance Reduction beyond Empirical Risk Minimization

no code implementations 16 Oct 2015 Massil Achab, Agathe Guilloux, Stéphane Gaïffas, Emmanuel Bacry

We introduce a doubly stochastic proximal gradient algorithm for optimizing a finite average of smooth convex functions, whose gradients depend on numerically expensive expectations.

Survival Analysis

Learning the intensity of time events with change-points

no code implementations 2 Jul 2015 Mokhtar Zahdi Alaya, Stéphane Gaïffas, Agathe Guilloux

We prove that this leads to a sharp tuning of the convex relaxation of the segmentation prior, by stating oracle inequalities with fast rates of convergence, and consistency for change-points detection.

Segmentation

Sparse and low-rank multivariate Hawkes processes

no code implementations 4 Jan 2015 Emmanuel Bacry, Martin Bompaire, Stéphane Gaïffas, Jean-François Muzy

We consider the problem of unveiling the implicit network structure of node interactions (such as user interactions in a social network), based only on high-frequency timestamps.

Concentration for matrix martingales in continuous time and microscopic activity of social networks

no code implementations 24 Dec 2014 Emmanuel Bacry, Stéphane Gaïffas, Jean-François Muzy

This paper gives new concentration inequalities for the spectral norm of a wide class of matrix martingales in continuous time.
