no code implementations • 14 Mar 2024 • Zhishuai Liu, Pan Xu
Distributionally robust offline reinforcement learning (RL), which seeks robust policy training against environment perturbation by modeling dynamics uncertainty, calls for function approximations when facing large state-action spaces.
1 code implementation • 23 Feb 2024 • Zhishuai Liu, Pan Xu
We provide the first study on online distributionally robust Markov decision processes (DRMDPs) with function approximation for off-dynamics RL.
no code implementations • 24 Oct 2023 • Zhen Qin, Zhishuai Liu, Pan Xu
Yet, existing analyses of signSGD rely on assuming that data are sampled with replacement in each iteration, contradicting the practical implementation where data are randomly reshuffled and sequentially fed into the algorithm.
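The two sampling schemes contrasted in this abstract can be illustrated with a minimal sketch. This is not the paper's algorithm or experimental setup: the least-squares objective, the function names (`with_replacement_indices`, `reshuffled_indices`, `signsgd_epoch`), the step size, and the synthetic data are all assumptions chosen for illustration.

```python
import numpy as np

def with_replacement_indices(n, rng):
    """Draw n indices i.i.d. from {0, ..., n-1}; some samples may repeat or be skipped."""
    return rng.integers(0, n, size=n)

def reshuffled_indices(n, rng):
    """Draw a random permutation, so each sample is visited exactly once per epoch."""
    return rng.permutation(n)

def signsgd_epoch(w, X, y, order, lr):
    """One pass of sign-based SGD on a squared loss, one sample per step (illustrative)."""
    for i in order:
        grad = (X[i] @ w - y[i]) * X[i]  # gradient of 0.5 * (x_i . w - y_i)^2
        w = w - lr * np.sign(grad)       # update uses only the sign of each coordinate
    return w

rng = np.random.default_rng(0)
X = rng.normal(size=(8, 3))              # synthetic data (assumption)
y = X @ np.array([1.0, -2.0, 0.5])
w0 = np.zeros(3)

order_rr = reshuffled_indices(len(X), rng)        # random reshuffling
order_wr = with_replacement_indices(len(X), rng)  # sampling with replacement

w_rr = signsgd_epoch(w0, X, y, order_rr, lr=0.01)
w_wr = signsgd_epoch(w0, X, y, order_wr, lr=0.01)
```

In the reshuffled order every index appears exactly once per epoch, so consecutive steps are not independent draws, which is the gap between with-replacement analyses and practice that the abstract points to.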