Search Results for author: Aurélien F. Bibaut

Found 3 papers, 0 papers with code

Rate-adaptive model selection over a collection of black-box contextual bandit algorithms

no code implementations • 5 Jun 2020 • Aurélien F. Bibaut, Antoine Chambaz, Mark J. Van Der Laan

To the best of our knowledge, our proposal is the first one to be rate-adaptive for a collection of general black-box contextual bandit algorithms: it achieves the same regret rate as the best candidate.

Model Selection

Generalized Policy Elimination: an efficient algorithm for Nonparametric Contextual Bandits

no code implementations • 5 Mar 2020 • Aurélien F. Bibaut, Antoine Chambaz, Mark J. Van Der Laan

We propose the Generalized Policy Elimination (GPE) algorithm, an oracle-efficient contextual bandit (CB) algorithm inspired by the Policy Elimination algorithm of Dudík et al. (2011).

Multi-Armed Bandits

More Efficient Off-Policy Evaluation through Regularized Targeted Learning

no code implementations • 13 Dec 2019 • Aurélien F. Bibaut, Ivana Malenica, Nikos Vlassis, Mark J. Van Der Laan

We study the problem of off-policy evaluation (OPE) in Reinforcement Learning (RL), where the aim is to estimate the performance of a new policy given historical data that may have been generated by a different policy, or policies.

Causal Inference, Off-policy evaluation
