no code implementations • 5 Oct 2023 • Zelai Xu, Yancheng Liang, Chao Yu, Yu Wang, Yi Wu
Alternatively, Policy-Space Response Oracles (PSRO) is an iterative framework for learning NE, where the best responses w.r.t.
no code implementations • 7 Aug 2023 • Yancheng Liang, Jiajie Zhang, Hui Li, Xiaochen Liu, Yi Hu, Yong Wu, Jinyao Zhang, Yongyan Liu, Yi Wu
Despite the tremendous advances achieved over the past years by deep learning techniques, the latest risk prediction models for industrial applications still rely on highly hand-tuned, stage-wise statistical learning tools, such as gradient boosting and random forest methods.
no code implementations • 13 Dec 2021 • Shusheng Xu, Yancheng Liang, Yunfei Li, Simon Shaolei Du, Yi Wu
A ubiquitous requirement in many practical reinforcement learning (RL) applications, including medical treatment, recommendation systems, education, and robotics, is that the deployed policy that actually interacts with the environment cannot change frequently.