ICLR 2018 • Sahil Sharma, Girish Raguvir J, Srivatsan Ramesh, Balaraman Ravindran
Our second major contribution is a generalization of λ-returns called Confidence-based Autodidactic Returns (CAR), in which the RL agent learns the weighting of the n-step returns end-to-end.
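
To make the idea of "weighting the n-step returns" concrete, the following minimal Python sketch builds the n-step returns for a short trajectory and combines them in two ways: with the fixed geometric weights of a truncated λ-return, and with weights derived from per-horizon scores. The softmax parameterization and the `confidences` values are assumptions made for illustration only; the text above does not specify how the learned weights are produced, and all function names here are hypothetical.

```python
import math


def n_step_returns(rewards, values, gamma):
    """Return [G^(1), ..., G^(N)] measured from time step 0.

    rewards[k] is r_k for k = 0..N-1 and values[k] is V(s_k) for k = 0..N.
    G^(n) = sum_{k<n} gamma^k * r_k + gamma^n * V(s_n).
    """
    returns = []
    discounted_sum = 0.0
    for n in range(1, len(rewards) + 1):
        discounted_sum += gamma ** (n - 1) * rewards[n - 1]
        returns.append(discounted_sum + gamma ** n * values[n])
    return returns


def lambda_weights(num_returns, lam):
    """Truncated lambda-return weights: (1 - lam) * lam^(n-1) for n < N,
    with the remaining mass lam^(N-1) placed on the longest return."""
    weights = [(1.0 - lam) * lam ** (n - 1) for n in range(1, num_returns)]
    weights.append(lam ** (num_returns - 1))
    return weights


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_return(returns, weights):
    """Convex combination of n-step returns."""
    return sum(w * g for w, g in zip(weights, returns))


if __name__ == "__main__":
    rewards = [1.0, 0.0, 2.0]        # r_0, r_1, r_2
    values = [0.5, 0.4, 0.3, 0.2]    # V(s_0), ..., V(s_3)
    returns = n_step_returns(rewards, values, gamma=0.9)

    # Fixed weighting: lambda-return with a hand-picked lambda.
    fixed = weighted_return(returns, lambda_weights(len(returns), lam=0.8))

    # Learned-style weighting (assumption): in an agent these scores would be
    # model outputs trained end-to-end; here they are placeholder numbers.
    confidences = [0.1, 1.5, -0.3]
    learned = weighted_return(returns, softmax(confidences))

    print(fixed, learned)
```

In this sketch the λ-return is one special case of a convex combination of n-step returns: its weights are fixed by a single hyperparameter, whereas the second variant lets every horizon receive its own weight.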