22 Dec 2023 • Zhong Zheng, Fengyu Gao, Lingzhou Xue, Jing Yang
In this paper, we consider federated reinforcement learning for tabular episodic Markov Decision Processes (MDPs) where, under the coordination of a central server, multiple agents collaboratively explore the environment and learn an optimal policy without sharing their raw data.