no code implementations • 8 Sep 2021 • Ziyi Chen, Yi Zhou, Rongrong Chen, Shaofeng Zou
Actor-critic (AC) algorithms have been widely adopted in decentralized multi-agent systems to learn the optimal joint control policy.
no code implementations • 24 Mar 2021 • Ziyi Chen, Yi Zhou, Rongrong Chen
Under Markovian sampling and linear function approximation, we prove that the finite-time sample complexity of both algorithms for achieving an $\epsilon$-accurate solution is on the order of $\mathcal{O}(\epsilon^{-1}\ln \epsilon^{-1})$, matching the near-optimal sample complexity of centralized TD(0) and TDC.
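For orientation, the centralized TD(0) baseline mentioned above can be sketched as follows. This is a minimal illustration of standard single-agent TD(0) with linear function approximation on one Markovian trajectory, not the decentralized algorithms analyzed in the paper. The two-state chain, rewards, one-hot features, step size, and the function name `td0_linear` are all assumptions chosen for the example.

```python
import random


def td0_linear(features, transition, rewards, gamma, alpha, num_steps, seed=0):
    """Centralized TD(0) with a linear value estimate V(s) = theta . phi(s).

    Samples come from a single Markovian trajectory (no resets, no i.i.d. draws).
    All arguments describe a hypothetical toy problem, not the paper's setup.
    """
    rng = random.Random(seed)
    states = list(range(len(features)))
    theta = [0.0] * len(features[0])
    state = 0
    for _ in range(num_steps):
        # Draw the next state from the current state's transition row.
        next_state = rng.choices(states, weights=transition[state])[0]
        phi, phi_next = features[state], features[next_state]
        value = sum(t * f for t, f in zip(theta, phi))
        next_value = sum(t * f for t, f in zip(theta, phi_next))
        # Temporal-difference error for the observed transition.
        delta = rewards[state] + gamma * next_value - value
        # Semi-gradient update along the current feature vector.
        theta = [t + alpha * delta * f for t, f in zip(theta, phi)]
        state = next_state
    return theta


# Hypothetical two-state chain with one-hot features, so theta[s] estimates V(s).
transition = [[0.7, 0.3], [0.3, 0.7]]
rewards = [0.0, 1.0]
features = [[1.0, 0.0], [0.0, 1.0]]
theta = td0_linear(features, transition, rewards,
                   gamma=0.9, alpha=0.005, num_steps=200_000)
print(theta)
```

With one-hot features the TD(0) fixed point coincides with the true value function, which for this chain solves $V = r + \gamma P V$, giving $V(0) = 4.21875$ and $V(1) = 5.78125$. Because the step size is constant, the iterate fluctuates around these values rather than converging to them exactly.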