no code implementations • 21 Jul 2022 • Benjamin Howson, Ciara Pike-Burke, Sarah Filippi
However, the stringent requirement for immediate rewards is not met in many real-world applications, where the reward is almost always delayed.
no code implementations • 15 Nov 2021 • Benjamin Howson, Ciara Pike-Burke, Sarah Filippi
In this paper, we study the impact of delayed feedback in episodic reinforcement learning from a theoretical perspective and propose two general-purpose approaches to handling the delays.