To evaluate the performance of a multi-domain goal-oriented Dialogue System (DS), it is important to understand what the users’ goals are for the conversations and whether those goals are successfully achieved.
Seq2seq language generation models that are trained offline with multiple domains in a sequential fashion often suffer from catastrophic forgetting.
Rephrase detection identifies user rephrases and has long been treated as a task with pairwise input, which does not fully utilize contextual information (e.g., users' implicit feedback).
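To make the pairwise-input limitation concrete, here is a minimal illustrative sketch (not any paper's actual model): a detector that scores only the two turns in isolation, so dialogue context such as implicit feedback has no way to enter the decision. The Jaccard scoring and the 0.5 threshold are assumptions for illustration.

```python
def jaccard(q1: str, q2: str) -> float:
    """Token-overlap similarity between two utterances."""
    t1, t2 = set(q1.lower().split()), set(q2.lower().split())
    return len(t1 & t2) / len(t1 | t2)

def is_rephrase_pairwise(q1: str, q2: str, threshold: float = 0.5) -> bool:
    """Pairwise rephrase detector: sees only the utterance pair.

    By construction it cannot consume surrounding context (earlier turns,
    whether the user interrupted playback, etc.), which is exactly the
    limitation noted above.
    """
    return jaccard(q1, q2) >= threshold

# Two turns that share most tokens are flagged as a rephrase pair;
# an unrelated follow-up is not.
print(is_rephrase_pairwise("play the weekend", "play the weeknd"))
print(is_rephrase_pairwise("play the weeknd", "set a timer"))
```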
Personalized dialogue agents (DAs) powered by large pre-trained language models (PLMs) often rely on explicit persona descriptions to maintain personality consistency.
Recent advances in cross-lingual commonsense reasoning (CSR) are facilitated by the development of multilingual pre-trained models (mPTMs).
Query Rewriting (QR) plays a critical role in large-scale dialogue systems for reducing friction.
Conversational understanding is an integral part of modern intelligent devices.
Self-learning paradigms in large-scale conversational AI agents tend to leverage user feedback to bridge the gap between what users say and what they mean.
Additionally, the dependency on a fixed vocabulary limits the adaptability of subword models across languages and domains.
However, these methods rarely address query expansion and entity weighting simultaneously, which may limit both the scope and the accuracy of retrieval with the reformulated query.
Individual user profiles and interaction histories play a significant role in providing customized experiences in real-world applications such as chatbots, social media, retail, and education.
Text Style Transfer (TST) aims to alter the underlying style of the source text to another specific style while preserving its content.
In this work, we go beyond the existing paradigms and propose a novel approach to generate high-quality paraphrases with weak supervision data.
Query rewriting (QR) systems are widely used to reduce the friction caused by errors in a spoken language understanding pipeline.
Spoken language understanding (SLU) systems in conversational AI agents often experience errors in the form of misrecognitions by automatic speech recognition (ASR) or semantic gaps in natural language understanding (NLU).
Then, inspired by the wide success of pre-trained contextual language embeddings, and also as a way to compensate for insufficient QR training data, we propose a language-modeling (LM) based approach to pre-train query embeddings on historical user conversation data with a voice assistant.
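As a rough illustration of pre-training query embeddings on historical conversation data, the sketch below trains a tiny CBOW-style objective (predict each token from the rest of the query) with NumPy and then embeds queries by mean-pooling token vectors. The toy queries, dimensions, and objective are all illustrative assumptions, not the actual system described here.

```python
import numpy as np

# Toy stand-in for historical voice-assistant queries (assumed data).
history = [
    "play some jazz music",
    "play some rock music",
    "turn off the lights",
    "turn on the lights",
]

vocab = sorted({w for q in history for w in q.split()})
idx = {w: i for i, w in enumerate(vocab)}

rng = np.random.default_rng(0)
dim = 16
E = rng.normal(scale=0.1, size=(len(vocab), dim))  # token embeddings
W = rng.normal(scale=0.1, size=(dim, len(vocab)))  # output projection

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

# CBOW-style LM objective: predict each token from the mean of the other
# tokens in the query (a crude stand-in for LM-based pre-training).
lr = 0.1
for _ in range(200):
    for q in history:
        toks = [idx[w] for w in q.split()]
        for i, target in enumerate(toks):
            ctx = [t for j, t in enumerate(toks) if j != i]
            h = E[ctx].mean(axis=0)
            grad = softmax(h @ W)
            grad[target] -= 1.0            # d(cross-entropy)/d(logits)
            W -= lr * np.outer(h, grad)
            E[ctx] -= lr * (W @ grad) / len(ctx)

def embed(query):
    """Embed a query as the mean of its pre-trained token vectors."""
    return E[[idx[w] for w in query.split() if w in idx]].mean(axis=0)

def cos(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Queries used in similar contexts end up closer than unrelated ones.
sim_related = cos(embed("play some jazz music"), embed("play some rock music"))
sim_unrelated = cos(embed("play some jazz music"), embed("turn off the lights"))
```

In the downstream QR setting, such pre-trained embeddings would then be fine-tuned with whatever limited supervised rewrite data is available.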
Typically, the accuracy of the ML models in these components is improved by manually transcribing and annotating data.
In this paper, we propose to distill the internal representations of a large model such as BERT into a simplified version of it.
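A common way to distill internal representations is to minimize the mean-squared error between (projected) student hidden states and aligned teacher hidden states. The sketch below shows that loss with NumPy; the layer counts, the uniform layer mapping, and the random arrays standing in for real BERT activations are all illustrative assumptions, not the paper's exact setup.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: a 12-layer teacher (hidden size 768) distilled into
# a 4-layer student (hidden size 312), aligning every 3rd teacher layer.
TEACHER_LAYERS, STUDENT_LAYERS = 12, 4
D_TEACHER, D_STUDENT = 768, 312
SEQ_LEN = 8

# Random arrays stand in for per-layer hidden states on one input.
teacher_h = [rng.normal(size=(SEQ_LEN, D_TEACHER)) for _ in range(TEACHER_LAYERS)]
student_h = [rng.normal(size=(SEQ_LEN, D_STUDENT)) for _ in range(STUDENT_LAYERS)]

# Learned projection mapping the student space into the teacher space.
W = rng.normal(scale=0.02, size=(D_STUDENT, D_TEACHER))

def hidden_state_distill_loss(student_h, teacher_h, W, layer_map):
    """MSE between projected student layers and their aligned teacher layers."""
    losses = [np.mean((student_h[s] @ W - teacher_h[t]) ** 2)
              for s, t in layer_map.items()]
    return float(np.mean(losses))

# Uniform skip strategy: student layer i mimics teacher layer 3*(i+1) - 1.
layer_map = {i: 3 * (i + 1) - 1 for i in range(STUDENT_LAYERS)}
loss = hidden_state_distill_loss(student_h, teacher_h, W, layer_map)
```

In training, this representation loss would typically be combined with a soft-label (logit) distillation term and minimized by gradient descent over the student and the projection.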