no code implementations • 1 Dec 2022 • Vanamala Venkataswamy, Jake Grigsby, Andrew Grimshaw, Yanjun Qi
We use offline RL as a launchpad to learn effective scheduling policies from prior experience collected using oracle or heuristic policies.
no code implementations • 10 Nov 2022 • Vanamala Venkataswamy, Jake Grigsby, Andrew Grimshaw, Yanjun Qi
Finally, we demonstrate that the DRL scheduler can learn from and improve upon existing heuristic policies using offline learning.