no code implementations • 2 May 2024 • Aurélien Bibaut, Nathan Kallus
Adaptive experiments such as multi-arm bandits adapt the treatment-allocation policy and/or the decision to stop the experiment to the data observed so far.
no code implementations • NeurIPS 2021 • Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan
Empirical risk minimization (ERM) is the workhorse of machine learning, whether for classification and regression or for off-policy policy learning, but its model-agnostic guarantees can fail when we use adaptively collected data, such as the result of running a contextual bandit algorithm.
no code implementations • NeurIPS 2021 • Aurélien Bibaut, Antoine Chambaz, Maria Dimakopoulou, Nathan Kallus, Mark van der Laan
The adaptive nature of the data collected by contextual bandit algorithms, however, makes this difficult: standard estimators are no longer asymptotically normally distributed and classic confidence intervals fail to provide correct coverage.
1 code implementation • 5 Apr 2017 • Cheng Ju, Aurélien Bibaut, Mark J. Van Der Laan
In this work, we investigated multiple widely used ensemble methods, including unweighted averaging, majority voting, the Bayes Optimal Classifier, and the (discrete) Super Learner, for image recognition tasks, with deep neural networks as candidate algorithms.