1 code implementation • 19 Nov 2022 • Christopher Beckham, Alexandre Piche, David Vazquez, Christopher Pal
Measuring the mean reward of generated candidates over this approximation is one such "validation metric", whereas we are interested in a more fundamental question: which validation metrics correlate the most with the ground truth?
no code implementations • 21 Oct 2022 • Alexandre Piche, Valentin Thomas, Joseph Marino, Rafael Pardinas, Gian Maria Marconi, Christopher Pal, Mohammad Emtiyaz Khan
However, learning the value function via bootstrapping often leads to unstable training due to fast-changing target values.
no code implementations • 21 Oct 2022 • Alexandre Piche, Rafael Pardinas, David Vazquez, Igor Mordatch, Chris Pal
Despite the benefits of using implicit models to learn robotic skills via BC, offline RL via Supervised Learning algorithms have been limited to explicit models.
no code implementations • ICLR 2019 • Alexandre Piche, Valentin Thomas, Cyril Ibrahim, Yoshua Bengio, Chris Pal
In this work, we propose a novel formulation of planning which views it as a probabilistic inference problem over future optimal trajectories.