1 code implementation • 15 Sep 2021 • Daniel Seita, Abhinav Gopal, Zhao Mandi, John Canny
Then, students learn either by running offline RL or by using teacher data in combination with a small amount of self-generated data.
Offline RL reinforcement-learning
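The second option mentioned in the excerpt, combining teacher data with a small amount of self-generated data, could be sketched roughly as below. This is only an illustrative sketch, not the authors' implementation: the function name `sample_mixed_batch`, the `student_fraction` parameter, and its default of 0.1 are assumptions, since the excerpt does not say how the two data sources are mixed.

```python
import random


def sample_mixed_batch(teacher_data, student_data, batch_size,
                       student_fraction=0.1, rng=None):
    """Draw one training batch that mixes teacher and student transitions.

    Hypothetical helper: the mixing ratio is an assumption, not a value
    taken from the paper. `teacher_data` is assumed to be non-empty.
    """
    rng = rng or random.Random()

    # Use student data only if the student has generated any so far.
    n_student = round(batch_size * student_fraction) if student_data else 0
    n_teacher = batch_size - n_student

    # Sample with replacement so a small student buffer is still usable.
    batch = rng.choices(teacher_data, k=n_teacher)
    if n_student:
        batch += rng.choices(student_data, k=n_student)

    # Shuffle so the two sources are interleaved within the batch.
    rng.shuffle(batch)
    return batch
```

With `student_fraction=0.0`, or with an empty student buffer, every sample comes from the teacher data, which corresponds to the purely offline setting.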