no code implementations • 9 Mar 2024 • Rong J. B. Zhu, Weiwei Jiang
Furthermore, we utilize the P-spline method to approximate the nonparametric function and develop procedures for inferring treatment effects within this framework.
no code implementations • 5 Jun 2023 • Rong J. B. Zhu
To determine the weights in the synthesis procedure, we propose an approach that uses an unbiased risk estimator as the criterion.
no code implementations • 10 Sep 2022 • Rong J. B. Zhu, James M. Murray
Off-policy algorithms, in which a behavior policy differs from the target policy and is used to gain experience for learning, have proven to be of great practical value in reinforcement learning.