no code implementations • 16 Oct 2023 • Thomas Jiralerspong, Flemming Kondrup, Doina Precup, Khimya Khetarpal
The ability to plan at many different levels of abstraction lets agents envision the long-term repercussions of their decisions, which in turn enables sample-efficient learning.
1 code implementation • 5 Oct 2022 • Flemming Kondrup, Thomas Jiralerspong, Elaine Lau, Nathan de Lara, Jacob Shkrob, My Duc Tran, Doina Precup, Sumana Basu
We design a clinically relevant intermediate reward that both encourages continuous improvement of the patient's vitals and addresses the challenge of sparse rewards in RL.