Dialogue Learning with Human Teaching and Feedback in End-to-End Trainable Task-Oriented Dialogue Systems

HLT 2018 Bing Liu • Gokhan Tur • Dilek Hakkani-Tur • Pararth Shah • Larry Heck

Popular methods for learning task-oriented dialogues include applying reinforcement learning with user feedback to models pre-trained with supervised learning. The efficiency of such learning methods may suffer from a mismatch in dialogue state distribution between the offline training and online interactive learning stages. To address this challenge, we propose a hybrid imitation and reinforcement learning method with which a dialogue agent can effectively learn from its interactions with users, using both human teaching and feedback.


Evaluation


Task | Dataset | Model | Metric name | Metric value | Global rank
Dialogue State Tracking | Second dialogue state tracking challenge | Liu et al. | Request | - | # 3
Dialogue State Tracking | Second dialogue state tracking challenge | Liu et al. | Area | 90 | # 3
Dialogue State Tracking | Second dialogue state tracking challenge | Liu et al. | Food | 84 | # 3
Dialogue State Tracking | Second dialogue state tracking challenge | Liu et al. | Price | 92 | # 3
Dialogue State Tracking | Second dialogue state tracking challenge | Liu et al. | Joint | 72 | # 3
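
As a reading aid for the metrics above, here is a minimal sketch of how per-slot and joint tracking accuracy are commonly computed. It assumes the Area, Food, and Price rows are per-slot accuracies in percent and that Joint counts a turn as correct only when every slot is correct; the listing itself does not state this. The function names and the toy data are hypothetical illustrations, not taken from the paper or from the challenge's scoring scripts.

```python
from typing import Dict, List

Slots = Dict[str, str]  # slot name -> tracked value for one dialogue turn


def slot_accuracy(preds: List[Slots], golds: List[Slots], slot: str) -> float:
    """Fraction of turns where the predicted value of one slot matches the gold value."""
    correct = sum(p.get(slot) == g.get(slot) for p, g in zip(preds, golds))
    return correct / len(golds)


def joint_accuracy(preds: List[Slots], golds: List[Slots], slots: List[str]) -> float:
    """Fraction of turns where every listed slot matches the gold value."""
    correct = sum(
        all(p.get(s) == g.get(s) for s in slots) for p, g in zip(preds, golds)
    )
    return correct / len(golds)


# Toy data (hypothetical): four turns, three slots.
golds = [
    {"area": "north", "food": "thai", "price": "cheap"},
    {"area": "south", "food": "italian", "price": "moderate"},
    {"area": "east", "food": "indian", "price": "expensive"},
    {"area": "west", "food": "chinese", "price": "cheap"},
]
preds = [
    {"area": "north", "food": "thai", "price": "cheap"},       # all slots correct
    {"area": "south", "food": "french", "price": "moderate"},  # food wrong
    {"area": "east", "food": "indian", "price": "cheap"},      # price wrong
    {"area": "west", "food": "chinese", "price": "cheap"},     # all slots correct
]

slots = ["area", "food", "price"]
per_slot = {s: slot_accuracy(preds, golds, s) for s in slots}
joint = joint_accuracy(preds, golds, slots)
print(per_slot, joint)
```

Under this definition the joint score can never exceed the lowest per-slot score, which is consistent with the Joint value (72) falling below the Food value (84) in the table.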