Task-Completion Dialogue Policy Learning
3 papers with code • 0 benchmarks • 0 datasets
Benchmarks
These leaderboards are used to track progress in Task-Completion Dialogue Policy Learning
Latest papers
Switch-based Active Deep Dyna-Q: Efficient Adaptive Planning for Task-Completion Dialogue Policy Learning
Training task-completion dialogue agents with reinforcement learning usually requires a large number of real user experiences.
Discriminative Deep Dyna-Q: Robust Planning for Dialogue Policy Learning
This paper presents a Discriminative Deep Dyna-Q (D3Q) approach to improving the effectiveness and robustness of Deep Dyna-Q (DDQ), a recently proposed framework that extends the Dyna-Q algorithm to integrate planning for task-completion dialogue policy learning.
Deep Dyna-Q: Integrating Planning for Task-Completion Dialogue Policy Learning
During dialogue policy learning, the world model is constantly updated with real user experience to approach real user behavior, and in turn, the dialogue agent is optimized using both real experience and simulated experience.