Search Results for author: Kelly W. Zhang

Found 10 papers, 2 papers with code

The Fallacy of Minimizing Local Regret in the Sequential Task Setting

no code implementations • 16 Mar 2024 • Ziping Xu, Kelly W. Zhang, Susan A. Murphy

In the realm of Reinforcement Learning (RL), online RL is often conceptualized as an optimization problem, where an algorithm interacts with an unknown environment to minimize cumulative regret.

Reinforcement Learning (RL)

A Bayesian Approach to Learning Bandit Structure in Markov Decision Processes

no code implementations • 30 Jul 2022 • Kelly W. Zhang, Omer Gottesman, Finale Doshi-Velez

In the reinforcement learning literature, there are many algorithms developed for either Contextual Bandit (CB) or Markov Decision Processes (MDP) environments.

Decision Making, reinforcement-learning, +1

Designing Reinforcement Learning Algorithms for Digital Interventions: Pre-implementation Guidelines

1 code implementation • 8 Jun 2022 • Anna L. Trella, Kelly W. Zhang, Inbal Nahum-Shani, Vivek Shetty, Finale Doshi-Velez, Susan A. Murphy

Online reinforcement learning (RL) algorithms are increasingly used to personalize digital interventions in the fields of mobile health and online education.

reinforcement-learning, Reinforcement Learning (RL)

Statistical Inference After Adaptive Sampling for Longitudinal Data

no code implementations • 14 Feb 2022 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy

In this work, we focus on longitudinal user data collected by a large class of adaptive sampling algorithms that are designed to optimize treatment decisions online using accruing data from multiple users.

reinforcement-learning, Reinforcement Learning (RL)

Statistical Inference with M-Estimators on Adaptively Collected Data

no code implementations • NeurIPS 2021 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy

Yet there is a lack of general methods for conducting statistical inference using more complex models on data collected with (contextual) bandit algorithms; for example, current methods cannot be used for valid inference on parameters in a logistic regression model for a binary reward.

Decision Making, Multi-Armed Bandits, +1

Inference for Batched Bandits

no code implementations • NeurIPS 2020 • Kelly W. Zhang, Lucas Janson, Susan A. Murphy

As bandit algorithms are increasingly utilized in scientific studies and industrial applications, there is an associated increasing need for reliable inference methods based on the resulting adaptively-collected data.

Multi-Armed Bandits

Language Modeling Teaches You More Syntax than Translation Does: Lessons Learned Through Auxiliary Task Analysis

no code implementations • 26 Sep 2018 • Kelly W. Zhang, Samuel R. Bowman

We find that representations from language models consistently perform best on our syntactic auxiliary prediction tasks, even when trained on relatively small amounts of data.

Language Modelling, Transfer Learning, +1

Language Modeling Teaches You More than Translation Does: Lessons Learned Through Auxiliary Task Analysis

no code implementations • 24 May 2018 • Kelly W. Zhang, Samuel R. Bowman

There is mounting evidence that pretraining can be valuable for neural network language understanding models, but we do not yet have a clear understanding of how the choice of pretraining objective affects the type of linguistic information that models learn.

Language Modelling, Transfer Learning, +1
