no code implementations • 5 Mar 2024 • Yingrong Wang, Anpeng Wu, Haoxuan Li, Weiming Liu, Qiaowei Miao, Ruoxuan Xiong, Fei Wu, Kun Kuang
This paper focuses on developing Pareto-optimal estimation and policy learning to identify the most effective treatment that maximizes the total reward from both short-term and long-term effects, which might conflict with each other.