Search Results for author: Yichun Hu

Found 5 papers, 2 papers with code

Practical Policy Optimization with Personalized Experimentation

no code implementations • 30 Mar 2023 • Mia Garrard, Hanson Wang, Ben Letham, Shaun Singh, Abbas Kazerouni, Sarah Tan, Zehui Wang, Yin Huang, Yichun Hu, Chad Zhou, Norm Zhou, Eytan Bakshy

Many organizations measure treatment effects via an experimentation platform to evaluate the causal effect of product variations prior to full-scale deployment.

Fast Rates for the Regret of Offline Reinforcement Learning

no code implementations • 31 Jan 2021 • Yichun Hu, Nathan Kallus, Masatoshi Uehara

Second, we provide new analyses of FQI and Bellman residual minimization to establish the correct pointwise convergence guarantees.

Decision Making reinforcement-learning +1

Fast Rates for Contextual Linear Optimization

no code implementations • 5 Nov 2020 • Yichun Hu, Nathan Kallus, Xiaojie Mao

While one may use off-the-shelf machine learning methods to separately learn a predictive model and plug it in, a variety of recent methods instead integrate estimation and optimization by fitting the model to directly optimize downstream decision performance.

Decision Making

DTR Bandit: Learning to Make Response-Adaptive Decisions With Low Regret

1 code implementation • 6 May 2020 • Yichun Hu, Nathan Kallus

While existing literature mostly focuses on estimating the optimal DTR from offline data, such as from sequentially randomized trials, we study the problem of developing the optimal DTR in an online manner, where the interaction with each individual affects both our cumulative reward and our data collection for future learning.

Smooth Contextual Bandits: Bridging the Parametric and Non-differentiable Regret Regimes

1 code implementation • 5 Sep 2019 • Yichun Hu, Nathan Kallus, Xiaojie Mao

We study a nonparametric contextual bandit problem where the expected reward functions belong to a Hölder class with smoothness parameter $\beta$.

Multi-Armed Bandits
