Search Results for author: Runzhe Wan

Found 13 papers, 3 papers with code

Multiplier Bootstrap-based Exploration

no code implementations • 3 Feb 2023 • Runzhe Wan, Haoyu Wei, Branislav Kveton, Rui Song

Despite the great interest in the bandit problem, designing efficient algorithms for complex models remains challenging, as there is typically no analytical way to quantify uncertainty.

Multi-Armed Bandits

STEEL: Singularity-aware Reinforcement Learning

no code implementations • 30 Jan 2023 • Xiaohong Chen, Zhengling Qi, Runzhe Wan

In this paper, we propose a new batch RL algorithm without requiring absolute continuity in the setting of an infinite-horizon Markov decision process with continuous states and actions.

Off-policy evaluation, reinforcement-learning

Heterogeneous Synthetic Learner for Panel Data

no code implementations • 30 Dec 2022 • Ye Shen, Runzhe Wan, Hengrui Cai, Rui Song

In the new era of personalization, learning the heterogeneous treatment effect (HTE) becomes an inevitable trend with numerous applications.

Mining the Factor Zoo: Estimation of Latent Factor Models with Sufficient Proxies

no code implementations • 25 Dec 2022 • Runzhe Wan, YingYing Li, Wenbin Lu, Rui Song

Latent factor model estimation typically relies on either using domain knowledge to manually pick several observed covariates as factor proxies, or purely conducting multivariate analysis such as principal component analysis.


Safe Exploration for Efficient Policy Evaluation and Comparison

no code implementations • 26 Feb 2022 • Runzhe Wan, Branislav Kveton, Rui Song

High-quality data plays a central role in ensuring the accuracy of policy evaluation.

Safe Exploration

Towards Scalable and Robust Structured Bandits: A Meta-Learning Framework

no code implementations • 26 Feb 2022 • Runzhe Wan, Lin Ge, Rui Song

In this paper, we propose a unified meta-learning framework for a general class of structured bandit problems where the parameter space can be factorized to item-level.

Meta-Learning, Thompson Sampling

A Multi-Agent Reinforcement Learning Framework for Off-Policy Evaluation in Two-sided Markets

1 code implementation • 21 Feb 2022 • Chengchun Shi, Runzhe Wan, Ge Song, Shikai Luo, Rui Song, Hongtu Zhu

In this paper we consider large-scale fleet management in ride-sharing companies that involve multiple units in different areas receiving sequences of products (or treatments) over time.

Management, Multi-agent Reinforcement Learning

Deeply-Debiased Off-Policy Interval Estimation

1 code implementation • 10 May 2021 • Chengchun Shi, Runzhe Wan, Victor Chernozhukov, Rui Song

Off-policy evaluation learns a target policy's value with a historical dataset generated by a different behavior policy.

Off-policy evaluation

Batch Policy Learning in Average Reward Markov Decision Processes

no code implementations • 23 Jul 2020 • Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan Murphy

The performance of the method is illustrated by simulation studies and an analysis of a mobile health study promoting physical activity.
