no code implementations • 16 Mar 2024 • Ziping Xu, Kelly W. Zhang, Susan A. Murphy
Online reinforcement learning (RL) is often conceptualized as an optimization problem in which an algorithm interacts with an unknown environment to minimize cumulative regret.
no code implementations • 26 Feb 2024 • Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Iris Yan, Finale Doshi-Velez, Susan A. Murphy
This paper proposes algorithm fidelity as a critical requirement for deploying online RL algorithms in clinical trials.
1 code implementation • 15 Aug 2022 • Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy
Dental disease is one of the most common chronic diseases despite being largely preventable.
no code implementations • 30 Jul 2022 • Kelly W. Zhang, Omer Gottesman, Finale Doshi-Velez
In the reinforcement learning literature, many algorithms have been developed for either Contextual Bandit (CB) or Markov Decision Process (MDP) environments.
1 code implementation • 8 Jun 2022 • Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy
Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education.
no code implementations • 14 Feb 2022 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy
In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users.
no code implementations • NeurIPS 2021 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy
There is a lack of general methods for conducting statistical inference with more complex models on data collected by (contextual) bandit algorithms; for example, current methods cannot provide valid inference on the parameters of a logistic regression model for a binary reward.
no code implementations • NeurIPS 2020 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy
As bandit algorithms are increasingly utilized in scientific studies and industrial applications, there is an associated increasing need for reliable inference methods based on the resulting adaptively-collected data.
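To see why adaptively-collected data complicates inference, consider a minimal simulation (an illustrative sketch, not the method from this line of work): an epsilon-greedy bandit with two arms of identical true mean. Because the algorithm samples an arm more often when its running sample mean looks favorable, naive per-arm sample means computed from the resulting data are no longer unbiased the way they would be under uniform random sampling. The function name and parameters below are hypothetical, chosen only for this example.

```python
import numpy as np

def run_eps_greedy(n_steps=1000, eps=0.1, means=(0.0, 0.0), seed=0):
    """Collect data with an epsilon-greedy two-armed bandit.

    Both arms have the same true mean, so any systematic gap in the
    arms' sample means in repeated runs reflects bias introduced by
    the adaptive sampling, not a real effect.
    """
    rng = np.random.default_rng(seed)
    rewards = {0: [], 1: []}
    for t in range(n_steps):
        if t < 2:
            arm = t  # pull each arm once so both sample means exist
        elif rng.random() < eps:
            arm = int(rng.integers(2))  # explore uniformly at random
        else:
            # exploit: pick the arm with the higher sample mean so far
            arm = int(np.mean(rewards[1]) > np.mean(rewards[0]))
        rewards[arm].append(means[arm] + rng.normal())
    return rewards

data = run_eps_greedy(n_steps=500, seed=1)
```

Averaging each arm's sample mean over many seeds (and comparing to the true mean of 0) shows the distortion that motivates the need for inference methods tailored to adaptively-collected data.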
no code implementations • 26 Sep 2018 • Kelly W. Zhang, Samuel R. Bowman
We find that representations from language models consistently perform best on our syntactic auxiliary prediction tasks, even when trained on relatively small amounts of data.
no code implementations • 24 May 2018 • Kelly W. Zhang, Samuel R. Bowman
There is mounting evidence that pretraining can be valuable for neural network language understanding models, but we do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn.