1 code implementation • ICLR 2020 • Benjamin James Lansdell, Prashanth Ravi Prakash, Konrad Paul Kording
We provide a proof that our approach converges to the true gradient for certain classes of networks.
Reinforcement Learning (RL)