no code implementations • 2 Feb 2024 • Sungee Hong, Zhengling Qi, Raymond K. W. Wong
We consider the problem of distributional off-policy evaluation, which serves as the foundation of many distributional reinforcement learning (DRL) algorithms.
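As a minimal, generic illustration of the distributional OPE problem (not the estimator developed in the paper), the sketch below estimates return quantiles of a target policy from trajectories logged under a different behavior policy, using self-normalized importance weights; the toy MDP and all of its parameters are invented.

```python
# Distributional OPE sketch: importance-weighted return quantiles.
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions, horizon, n_traj = 2, 2, 3, 5000

# Invented dynamics and rewards for a toy finite MDP.
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))  # P[s, a] -> next-state probs
R = rng.normal(size=(n_states, n_actions))                        # mean reward per (s, a)

behavior = np.full((n_states, n_actions), 0.5)   # uniform logging policy
target = np.array([[0.9, 0.1], [0.2, 0.8]])      # policy we want to evaluate

returns, weights = np.zeros(n_traj), np.ones(n_traj)
for i in range(n_traj):
    s = rng.integers(n_states)
    for _ in range(horizon):
        a = rng.choice(n_actions, p=behavior[s])
        weights[i] *= target[s, a] / behavior[s, a]      # cumulative importance ratio
        returns[i] += R[s, a] + 0.1 * rng.normal()       # noisy reward
        s = rng.choice(n_states, p=P[s, a])

# Self-normalized weighted quantiles approximate the return *distribution*
# under the target policy, not just its mean.
order = np.argsort(returns)
cum_w = np.cumsum(weights[order]) / weights.sum()
for q in (0.1, 0.5, 0.9):
    print(f"q={q}: {returns[order][np.searchsorted(cum_w, q)]:.3f}")
```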
1 code implementation • 28 Oct 2023 • Jin Zhu, Runzhe Wan, Zhengling Qi, Shikai Luo, Chengchun Shi
This paper endeavors to augment the robustness of offline reinforcement learning (RL) in scenarios laden with heavy-tailed rewards, a prevalent circumstance in real-world applications.
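As a hedged illustration of the underlying statistical issue (not the paper's algorithm), the sketch below compares the empirical mean with a median-of-means estimator, one standard robustification device, on heavy-tailed reward draws; the distribution and block count are invented.

```python
# Heavy-tailed rewards break the empirical mean; median-of-means is robust.
import numpy as np

rng = np.random.default_rng(1)
true_mean = 3.0
rewards = true_mean + rng.standard_t(df=1.5, size=2000)  # infinite-variance noise

def median_of_means(x, n_blocks=20):
    """Split samples into blocks, average each block, return the median of averages."""
    blocks = np.array_split(rng.permutation(x), n_blocks)
    return np.median([b.mean() for b in blocks])

print("empirical mean: ", rewards.mean())
print("median of means:", median_of_means(rewards))
```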
no code implementations • 14 Jun 2023 • Zeyu Bian, Chengchun Shi, Zhengling Qi, Lan Wang
This work aims to study off-policy evaluation (OPE) under scenarios where two key reinforcement learning (RL) assumptions -- temporal stationarity and individual homogeneity -- are both violated.
no code implementations • 26 May 2023 • Mao Hong, Zhengling Qi, Yanxun Xu
To the best of our knowledge, this is the first work studying the policy gradient method for POMDPs under the offline setting.
no code implementations • 24 Mar 2023 • Tao Ma, Hengrui Cai, Zhengling Qi, Chengchun Shi, Eric B. Laber
In real-world applications of reinforcement learning, it is often challenging to obtain a state representation that is parsimonious and satisfies the Markov property without prior knowledge.
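As a heuristic, invented illustration of why the Markov property matters for state representations (this is not the paper's methodology), the sketch below fits next-state regressions with and without an extra lag: if adding history improves prediction, the current state alone is not Markov.

```python
# Heuristic Markov check: does an extra lag improve next-state prediction?
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(2)
T = 5000
s = np.zeros(T)
for t in range(2, T):  # an AR(2) chain, so s_t alone is NOT Markov
    s[t] = 0.5 * s[t - 1] + 0.3 * s[t - 2] + rng.normal()

X1 = s[1:-1].reshape(-1, 1)                 # predictor: s_t only
X2 = np.column_stack([s[1:-1], s[:-2]])     # predictors: (s_t, s_{t-1})
y = s[2:]
for name, X in [("s_t only", X1), ("(s_t, s_{t-1})", X2)]:
    r2 = LinearRegression().fit(X, y).score(X, y)
    print(f"{name}: R^2 = {r2:.3f}")        # the second R^2 is clearly higher
```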
no code implementations • 24 Feb 2023 • Rui Miao, Zhengling Qi, Cong Shi, Lin Lin
Specifically, relying on the structural models of revenue and price, we establish the identifiability condition of an optimal pricing strategy under endogeneity with the help of invalid instrumental variables.
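For background on the endogeneity problem itself, the sketch below shows classical two-stage least squares with a valid instrument on invented data; note the paper's contribution is precisely to handle invalid instruments, which this textbook baseline does not.

```python
# Price endogeneity: OLS is biased, 2SLS with a valid instrument is not.
import numpy as np

rng = np.random.default_rng(3)
n = 20000
z = rng.normal(size=n)                 # instrument (e.g., a cost shifter)
u = rng.normal(size=n)                 # unobserved demand shock
price = 1.0 + 0.8 * z + 0.7 * u + rng.normal(size=n)         # endogenous price
revenue = 2.0 - 1.5 * price + 2.0 * u + rng.normal(size=n)   # true slope: -1.5

X = np.column_stack([np.ones(n), price])
print("OLS slope: ", np.linalg.lstsq(X, revenue, rcond=None)[0][1])   # biased

# Stage 1: project price on the instrument; Stage 2: regress on fitted price.
Z = np.column_stack([np.ones(n), z])
price_hat = Z @ np.linalg.lstsq(Z, price, rcond=None)[0]
X2 = np.column_stack([np.ones(n), price_hat])
print("2SLS slope:", np.linalg.lstsq(X2, revenue, rcond=None)[0][1])  # near -1.5
```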
no code implementations • 8 Feb 2023 • Juncheng Dong, Weibin Mo, Zhengling Qi, Cong Shi, Ethan X. Fang, Vahid Tarokh
The objective is to use the offline dataset to find an optimal assortment.
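As a minimal sketch of what "an optimal assortment" means once a choice model is in hand, the code below enumerates assortments under a multinomial logit model with invented attraction values and prices; the paper instead learns these quantities from offline data.

```python
# Assortment optimization under a multinomial logit (MNL) choice model.
from itertools import combinations
import numpy as np

utilities = np.array([1.0, 0.8, 0.5, 0.3])   # attraction values v_i = exp(u_i)
prices = np.array([4.0, 5.0, 6.0, 7.0])

def expected_revenue(assortment):
    v = utilities[list(assortment)]
    p = prices[list(assortment)]
    return (v * p).sum() / (1.0 + v.sum())   # the 1.0 is the no-purchase option

best = max(
    (frozenset(c) for r in range(1, 5) for c in combinations(range(4), r)),
    key=expected_revenue,
)
print("optimal assortment:", sorted(best), "revenue:", expected_revenue(best))
```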
no code implementations • 30 Jan 2023 • Xiaohong Chen, Zhengling Qi, Runzhe Wan
Batch reinforcement learning (RL) aims at leveraging pre-collected data to find an optimal policy that maximizes the expected total rewards in a dynamic environment.
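One standard way to make this concrete is fitted Q-iteration on a logged dataset of (state, action, reward, next state) tuples. The sketch below, on an invented one-dimensional toy problem, is a generic batch-RL baseline rather than the method of the paper.

```python
# Fitted Q-iteration: extract a policy from pre-collected transitions.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(4)
n, n_actions, gamma = 5000, 2, 0.9

# Invented logged transitions: 1-d state, reward favors matching a = 1{s > 0}.
s = rng.uniform(-1, 1, size=n)
a = rng.integers(n_actions, size=n)
r = np.where(a == (s > 0), 1.0, 0.0) + 0.1 * rng.normal(size=n)
s_next = np.clip(s + rng.normal(scale=0.3, size=n), -1, 1)

X = np.column_stack([s, a])
q = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, r)
for _ in range(20):  # Bellman backups against the current Q-function
    q_next = np.max(
        [q.predict(np.column_stack([s_next, np.full(n, b)])) for b in range(n_actions)],
        axis=0,
    )
    q = RandomForestRegressor(n_estimators=50, random_state=0).fit(X, r + gamma * q_next)

greedy = lambda state: int(np.argmax([q.predict([[state, b]]) for b in range(n_actions)]))
print("action at s=0.5:", greedy(0.5), "| action at s=-0.5:", greedy(-0.5))
```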
no code implementations • 5 Jan 2023 • Chengchun Shi, Zhengling Qi, Jianing Wang, Fan Zhou
When the initial policy is consistent, under some mild conditions, our method will yield a policy whose value converges to the optimal one at a faster rate than the initial policy, achieving the desired "value enhancement" property.
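A minimal tabular sketch of the generic one-step improvement idea behind value enhancement (not the paper's procedure or its guarantees): evaluate an initial policy's Q-function, then act greedily with respect to it; the MDP here is invented.

```python
# One-step policy improvement: the greedy policy dominates the initial one.
import numpy as np

rng = np.random.default_rng(5)
S, A, gamma = 3, 2, 0.9
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition probabilities P[s, a]
R = rng.normal(size=(S, A))                  # mean rewards

def evaluate(policy):
    """Iterative policy evaluation: Q = R + gamma * P @ V."""
    Q = np.zeros((S, A))
    for _ in range(500):
        Q = R + gamma * P @ (policy * Q).sum(axis=1)
    return Q

pi0 = np.full((S, A), 1 / A)                 # initial uniform policy
Q0 = evaluate(pi0)
pi1 = np.eye(A)[np.argmax(Q0, axis=1)]       # greedy one-step improvement
print("V_pi0:", (pi0 * Q0).sum(axis=1))
print("V_pi1:", (pi1 * evaluate(pi1)).sum(axis=1))  # state-wise at least as large
```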
no code implementations • 23 Dec 2022 • Zuyue Fu, Zhengling Qi, Zhuoran Yang, Zhaoran Wang, Lan Wang
To tackle the distributional mismatch, we leverage the idea of pessimism and use our OPE method to develop an off-policy learning algorithm for finding a desirable policy pair for both Alice and Bob.
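As a hedged illustration of the pessimism principle in the simpler one-player, bandit-style case (the paper applies it to a two-player setting), the sketch below scores candidate policies by an importance-sampling value estimate minus an uncertainty penalty and selects the best lower bound; all data are invented.

```python
# Pessimistic offline policy selection: maximize a lower confidence bound.
import numpy as np

rng = np.random.default_rng(6)
n = 2000
context = rng.normal(size=n)
action = rng.integers(2, size=n)                      # uniform logging policy
reward = (action == (context > 0)) + rng.normal(scale=0.5, size=n)

def lower_bound(policy, alpha=1.0):
    """Self-normalized importance-sampling estimate minus a width penalty."""
    w = 2.0 * (policy(context) == action)             # ratio vs. uniform(0.5) logging
    est = (w * reward).sum() / w.sum()
    width = alpha * (w * reward).std() / np.sqrt(n)
    return est - width

candidates = {
    "always 0": lambda c: np.zeros_like(c, dtype=int),
    "threshold": lambda c: (c > 0).astype(int),
    "anti-threshold": lambda c: (c <= 0).astype(int),
}
best = max(candidates, key=lambda k: lower_bound(candidates[k]))
print({k: round(lower_bound(v), 3) for k, v in candidates.items()}, "->", best)
```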
1 code implementation • 12 Nov 2022 • Xiaoqing Tan, Zhengling Qi, Christopher W. Seymour, Lu Tang
This paper introduces RISE, a robust individualized decision learning framework with sensitive variables: variables that are collectible and important to the intervention decision, but whose inclusion in decision making is prohibited for reasons such as delayed availability or fairness concerns.
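An invented, distillation-style sketch of the general idea (not the RISE algorithm itself): sensitive variables improve the outcome model at training time, but the deployed decision rule never takes them as input.

```python
# Train with a sensitive variable z; deploy a rule that uses only x.
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

rng = np.random.default_rng(7)
n = 5000
x = rng.normal(size=(n, 2))        # permitted covariates
z = rng.normal(size=n)             # sensitive variable: usable in training only
a = rng.integers(2, size=n)        # randomized treatment
y = (2 * a - 1) * (x[:, 0] + 0.5 * z) + rng.normal(size=n)  # outcome

# Step 1: outcome model with the sensitive variable included.
feats = np.column_stack([x, z, a, a * x[:, 0], a * z])
outcome = LinearRegression().fit(feats, y)

def pred(a_val):
    f = np.column_stack([x, z, np.full(n, a_val), a_val * x[:, 0], a_val * z])
    return outcome.predict(f)

# Step 2: pseudo-label the treatment the full model prefers, then fit a
# decision rule on the permitted covariates only.
best_action = (pred(1) > pred(0)).astype(int)
rule = LogisticRegression().fit(x, best_action)   # deployable without z
print("agreement with sign(x0) oracle:", (rule.predict(x) == (x[:, 0] > 0)).mean())
```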
1 code implementation • 26 Oct 2022 • Yunzhe Zhou, Zhengling Qi, Chengchun Shi, Lexin Li
In this article, we propose a novel pessimism-based Bayesian learning method for optimal dynamic treatment regimes in the offline setting.
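As an invented illustration of pessimism in a Bayesian form (not the paper's method for dynamic treatment regimes), the sketch below selects the action with the best posterior lower quantile rather than the best posterior mean, which guards against poorly covered actions in offline data.

```python
# Pessimistic Bayesian action selection via posterior lower quantiles.
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(8)
# Invented offline data: action 1 looks better on average but is rarely logged.
y0 = rng.normal(1.0, 1.0, size=500)    # many observations of action 0
y1 = rng.normal(1.2, 1.0, size=5)      # few observations of action 1

def posterior_lower(y, q=0.05, prior_var=10.0, noise_var=1.0):
    """Conjugate normal-prior/normal-likelihood posterior for the mean outcome."""
    n = len(y)
    var = 1.0 / (1.0 / prior_var + n / noise_var)
    mean = var * (y.sum() / noise_var)
    return norm.ppf(q, loc=mean, scale=np.sqrt(var))

for name, y in [("action 0", y0), ("action 1", y1)]:
    print(name, "posterior 5% quantile:", round(posterior_lower(y), 3))
# Pessimism favors the well-covered action 0 despite action 1's higher mean.
```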
no code implementations • 29 Sep 2022 • Jiayi Wang, Zhengling Qi, Chengchun Shi
This approach utilizes the observed action, whether from AI or humans, as an input for achieving a stronger oracle in policy learning for the decision maker (humans or AI).
no code implementations • 21 Sep 2022 • Rui Miao, Zhengling Qi, Xiaoke Zhang
We study the problem of off-policy evaluation (OPE) for episodic Partially Observable Markov Decision Processes (POMDPs) with continuous states.
no code implementations • 18 Sep 2022 • Zuyue Fu, Zhengling Qi, Zhaoran Wang, Zhuoran Yang, Yanxun Xu, Michael R. Kosorok
Due to the lack of online interaction with the environment, offline RL faces the following two significant challenges: (i) the agent may be confounded by the unobserved state variables; (ii) the offline data collected a priori do not provide sufficient coverage for the environment.
no code implementations • 17 Jan 2022 • Xiaohong Chen, Zhengling Qi
We study the off-policy evaluation (OPE) problem in an infinite-horizon Markov decision process with continuous states and actions.
no code implementations • 29 Nov 2021 • Chao-Han Huck Yang, Zhengling Qi, Yifan Cui, Pin-Yu Chen
Deep Reinforcement Learning (DRL) has demonstrated great potential for solving sequential decision-making problems in many applications.
no code implementations • 17 Oct 2021 • Weibin Mo, Zhengling Qi, Yufeng Liu
However, when the testing sample size available for training grows at a slower rate, efficient value function estimates may no longer perform well.
no code implementations • 10 Sep 2021 • Jiayi Wang, Zhengling Qi, Raymond K. W. Wong
Offline policy evaluation (OPE) is considered a fundamental and challenging problem in reinforcement learning (RL).
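As one standard OPE baseline (distinct from the estimator studied in the paper), the sketch below runs fitted Q-evaluation with deliberately simple linear features on invented logged transitions: regress Bellman targets under the fixed target policy, then average the learned Q-function over initial states.

```python
# Fitted Q-evaluation (FQE) of a fixed target policy from logged data.
import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
n, gamma = 5000, 0.9
s = rng.uniform(-1, 1, size=n)
a = rng.integers(2, size=n)                        # uniform behavior policy
r = np.where(a == (s > 0), 1.0, 0.0)
s_next = np.clip(s + rng.normal(scale=0.3, size=n), -1, 1)

target = lambda state: (state > 0).astype(int)     # policy to evaluate

def feats(state, action):
    # Simple linear features; a real application would use a richer class.
    return np.column_stack([state, action, state * action, np.ones_like(state)])

q = Ridge().fit(feats(s, a), r)
for _ in range(30):                                # iterate Bellman regressions
    y = r + gamma * q.predict(feats(s_next, target(s_next)))
    q = Ridge().fit(feats(s, a), y)

s0 = rng.uniform(-1, 1, size=1000)                 # initial-state distribution
print("estimated value of target policy:", q.predict(feats(s0, target(s0))).mean())
```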
no code implementations • 3 May 2021 • Zhengling Qi, Rui Miao, Xiaoke Zhang
Data-driven individualized decision making has recently received increasing research interest.
no code implementations • 9 Nov 2020 • Zhengling Qi, Peng Liao
We study the offline data-driven sequential decision making problem in the framework of a Markov decision process (MDP).
no code implementations • 23 Jul 2020 • Peng Liao, Zhengling Qi, Runzhe Wan, Predrag Klasnja, Susan Murphy
The performance of the method is illustrated by simulation studies and an analysis of a mobile health study promoting physical activity.
no code implementations • 26 Jun 2020 • Weibin Mo, Zhengling Qi, Yufeng Liu
We propose a novel distributionally robust ITR (DR-ITR) framework that maximizes the worst-case value function across the values under a set of underlying distributions that are "close" to the training distribution.
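A minimal sketch of the evaluation step behind such a worst-case criterion, assuming a KL-ball ambiguity set around the empirical distribution (one common choice; the radius rho and all data are invented): the worst-case mean has a one-dimensional dual that is easy to solve numerically.

```python
# Worst-case mean outcome over a KL ball, via its scalar dual problem.
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(10)
outcomes = rng.normal(1.0, 1.0, size=2000)   # outcomes under a candidate ITR

def worst_case_mean(x, rho=0.1):
    """inf over {Q : KL(Q||P) <= rho} of E_Q[X], using the dual
    -min_{lam > 0} [ lam * log E_P[exp(-X/lam)] + lam * rho ]."""
    dual = lambda lam: lam * np.log(np.mean(np.exp(-x / lam))) + lam * rho
    # Lower bound on lam kept away from 0 to avoid overflow in exp().
    res = minimize_scalar(dual, bounds=(0.05, 100.0), method="bounded")
    return -res.fun

print("empirical mean: ", outcomes.mean())
print("worst-case mean:", worst_case_mean(outcomes))   # strictly smaller
```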
no code implementations • 6 Oct 2019 • Zhengling Qi, Ying Cui, Yufeng Liu, Jong-Shi Pang
This paper has two main goals: (a) establish several statistical properties -- consistency, asymptotic distributions, and convergence rates -- of stationary solutions and values of a class of coupled nonconvex and nonsmooth empirical risk minimization problems, and (b) validate these properties on a noisy amplitude-based phase retrieval problem, the latter being of much topical interest. Derived from available data via sampling, these empirical risk minimization problems are the computational workhorse of a population risk model which involves the minimization of an expected value of a random functional.
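A minimal sketch of the noisy amplitude-based phase retrieval ERM referenced above, minimize (1/n) * sum_i (|a_i^T x| - b_i)^2, solved here by plain subgradient descent from a random start; in harder regimes a spectral initialization is typically used, and all problem sizes and step sizes here are invented.

```python
# Subgradient descent on the nonconvex, nonsmooth amplitude-based ERM.
import numpy as np

rng = np.random.default_rng(11)
n, d = 2000, 20
A = rng.normal(size=(n, d))
x_true = rng.normal(size=d)
b = np.abs(A @ x_true) + 0.01 * rng.normal(size=n)   # noisy amplitude data

x = rng.normal(size=d)                               # random initialization
for t in range(500):
    z = A @ x
    # Subgradient of (|z_i| - b_i)^2 w.r.t. x, taking sign(0) = 0.
    g = A.T @ (2 * (np.abs(z) - b) * np.sign(z)) / n
    x -= (0.1 / np.sqrt(t + 1)) * g                  # diminishing step size

# Recovery is only defined up to a global sign flip.
err = min(np.linalg.norm(x - x_true), np.linalg.norm(x + x_true))
print("relative error:", err / np.linalg.norm(x_true))
```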
no code implementations • 27 Aug 2019 • Zhengling Qi, Ying Cui, Yufeng Liu, Jong-Shi Pang
Recent exploration of optimal individualized decision rules (IDRs) for patients in precision medicine has attracted a lot of attention due to the heterogeneous responses of patients to different treatments.