Sequence-to-sequence neural network models for conversational response generation tend to produce safe, commonplace responses (e.g., "I don't know") regardless of the input.
Pre-training and fine-tuning, e.g., BERT, have achieved great success in language understanding by transferring knowledge from a resource-rich pre-training task to low/zero-resource downstream tasks.
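As an illustration of this paradigm, here is a minimal fine-tuning sketch using the Hugging Face transformers library; the checkpoint name, toy dataset, and hyperparameters are assumptions for illustration, not details from the original text.

```python
# Minimal sketch: fine-tune a pre-trained encoder on a small downstream
# classification task. Model name, data, and hyperparameters are illustrative.
import torch
from torch.utils.data import DataLoader, TensorDataset
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2  # new task-specific head, randomly initialized
)

# Tiny toy dataset standing in for a low-resource downstream task.
texts = ["great movie", "terrible plot", "loved it", "waste of time"]
labels = [1, 0, 1, 0]
enc = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
dataset = TensorDataset(enc["input_ids"], enc["attention_mask"], torch.tensor(labels))

optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)
model.train()
for epoch in range(3):  # a few epochs often suffice when starting from pre-trained weights
    for input_ids, attention_mask, y in DataLoader(dataset, batch_size=2, shuffle=True):
        out = model(input_ids=input_ids, attention_mask=attention_mask, labels=y)
        out.loss.backward()
        optimizer.step()
        optimizer.zero_grad()
```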
We present a large, tunable neural conversational response generation model, DialoGPT (dialogue generative pre-trained transformer).
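For concreteness, a minimal sketch of generating a single response with a publicly released DialoGPT checkpoint via Hugging Face transformers; the checkpoint name and decoding settings are assumptions about a typical setup, not specifics from the original text.

```python
# Minimal sketch: single-turn response generation with a released DialoGPT
# checkpoint. Checkpoint name and decoding parameters are illustrative.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

# DialoGPT is trained on dialogue turns joined by the end-of-sequence token.
user_input = "Does money buy happiness?"
input_ids = tokenizer.encode(user_input + tokenizer.eos_token, return_tensors="pt")

# Sampling (rather than greedy decoding) helps avoid bland, generic replies.
output_ids = model.generate(
    input_ids,
    max_length=100,
    do_sample=True,
    top_p=0.9,
    pad_token_id=tokenizer.eos_token_id,
)
response = tokenizer.decode(output_ids[0, input_ids.shape[-1]:], skip_special_tokens=True)
print(response)
```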
Responses generated by neural conversational models tend to lack informativeness and diversity.
In this work, we propose a memory-augmented generative model, which learns to abstract useful information from the training corpus and store it in a memory to assist response generation.
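A minimal sketch of the general idea follows; the module names, sizes, and the way the memory is read are assumptions for illustration, not the authors' exact architecture.

```python
# Minimal sketch of memory-augmented generation: the decoder reads a learned
# memory of corpus-level vectors via attention and mixes the result into its
# hidden state. All names and sizes are illustrative, not the original design.
import torch
import torch.nn as nn

class MemoryAugmentedDecoder(nn.Module):
    def __init__(self, vocab_size, hidden=256, memory_slots=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, hidden)
        self.rnn = nn.GRU(hidden, hidden, batch_first=True)
        # Learned memory, intended to abstract information from the training corpus.
        self.memory = nn.Parameter(torch.randn(memory_slots, hidden))
        self.attn = nn.MultiheadAttention(hidden, num_heads=4, batch_first=True)
        self.out = nn.Linear(2 * hidden, vocab_size)

    def forward(self, input_ids):
        h, _ = self.rnn(self.embed(input_ids))               # (B, T, H) decoder states
        mem = self.memory.unsqueeze(0).expand(h.size(0), -1, -1)
        read, _ = self.attn(query=h, key=mem, value=mem)     # attend over memory slots
        return self.out(torch.cat([h, read], dim=-1))        # fuse state and memory read

# Toy usage: logits over the vocabulary for each position of a dummy batch.
decoder = MemoryAugmentedDecoder(vocab_size=1000)
logits = decoder(torch.randint(0, 1000, (2, 10)))
print(logits.shape)  # torch.Size([2, 10, 1000])
```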