10 Feb 2017 • Kirthevasan Kandasamy, Yoram Bachrach, Ryota Tomioka, Daniel Tarlow, David Carter
We study reinforcement learning of chatbots with recurrent neural network architectures when the rewards are noisy and expensive to obtain.