Search Results for author: Joshua Uyheng

Found 1 paper, 1 paper with code

Reliability and Learnability of Human Bandit Feedback for Sequence-to-Sequence Reinforcement Learning

1 code implementation • ACL 2018 • Julia Kreutzer, Joshua Uyheng, Stefan Riezler

We present a study on reinforcement learning (RL) from human bandit feedback for sequence-to-sequence learning, exemplified by the task of bandit neural machine translation (NMT).

Machine Translation, NMT
