no code implementations • 27 Nov 2022 • Alan Clark, Shoaib Ahmed Siddiqui, Robert Kirk, Usman Anwar, Stephen Chung, David Krueger
Existing offline reinforcement learning (RL) algorithms typically assume that training data is either: 1) generated by a known policy, or 2) of entirely unknown origin.