no code implementations • 16 Oct 2020 • Teng Long, Qing-Shan Jia
The transition to a zero-carbon power system is underway and has been accelerating recently.
no code implementations • 31 Oct 2021 • Kuo Li, Qing-Shan Jia
Furthermore, a convergence analysis is given for the discrete-space case, which guarantees that the policy will be reinforced by alternating between the processes of policy evaluation and policy improvement.
no code implementations • 31 Oct 2021 • Kuo Li, Qing-Shan Jia, Jiaqi Yan
We formulate the sampling process as a policy search problem and give a solution from the perspective of Reinforcement Learning (RL).
no code implementations • 22 Aug 2022 • Qilong Huang, Qing-Shan Jia, Xiang Wu, Shengyuan Xu, Xiaohong Guan
First, a joint scheduling model of pricing and charging control is developed to maximize the expected social welfare of the charging station considering the Quality of Service and the price fluctuation sensitivity of EV drivers.
no code implementations • 13 Dec 2022 • Shuang Wu, Xiaoqiang Ren, Qing-Shan Jia, Karl Henrik Johansson, Ling Shi
To address this challenge, we reformulate the problem as a variant of the restless multi-armed bandit (RMAB) problem and leverage Whittle's index theory to design an index-based scheduling policy.
1 code implementation • 3 Feb 2023 • Jianxiong Li, Xiao Hu, Haoran Xu, Jingjing Liu, Xianyuan Zhan, Qing-Shan Jia, Ya-Qin Zhang
RGM is formulated as a bi-level optimization problem: the upper layer optimizes a reward correction term that performs visitation distribution matching w.r.t.
no code implementations • 27 May 2023 • Xiao Hu, Jianxiong Li, Xianyuan Zhan, Qing-Shan Jia, Ya-Qin Zhang
To unravel this mystery, we identify a long-neglected issue in the query selection schemes of existing PbRL studies: Query-Policy Misalignment.