(2) How to cohere with context and preserve the knowledge when generating a stylized response.
Conversational recommender systems (CRSs) have received extensive attention in recent years.
To address these challenges, we present HeterMPC, a heterogeneous graph-based neural network for response generation in MPCs which models the semantics of utterances and interlocutors simultaneously with two types of nodes in a graph.
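The two-node-type graph described above can be illustrated with a small data-structure sketch. This is not the HeterMPC implementation: the class `HeteroGraph`, the function `build_graph`, the edge labels `spoke` and `reply-to`, and the sample conversation are all hypothetical names chosen for illustration, and the neural message passing over the graph is omitted.

```python
from dataclasses import dataclass, field


@dataclass
class HeteroGraph:
    """Toy heterogeneous graph: every node and every edge carries a type."""
    nodes: dict = field(default_factory=dict)  # node id -> node type
    edges: list = field(default_factory=list)  # (source id, edge type, target id)

    def add_node(self, node_id, node_type):
        self.nodes[node_id] = node_type

    def add_edge(self, src, edge_type, dst):
        self.edges.append((src, edge_type, dst))


def build_graph(turns):
    """Build a graph from turns given as (speaker, text, replied-to index or None)."""
    graph = HeteroGraph()
    for i, (speaker, _text, reply_to) in enumerate(turns):
        utt_id = f"utt{i}"
        graph.add_node(utt_id, "utterance")
        graph.add_node(speaker, "interlocutor")  # re-adding a known speaker changes nothing
        graph.add_edge(speaker, "spoke", utt_id)  # assumed edge label
        if reply_to is not None:
            graph.add_edge(utt_id, "reply-to", f"utt{reply_to}")  # assumed edge label
    return graph


# Hypothetical three-party conversation.
turns = [
    ("alice", "Anyone tried the new kernel?", None),
    ("bob", "Yes, works fine.", 0),
    ("carol", "It broke my wifi.", 0),
]
graph = build_graph(turns)
```

Keeping utterances and interlocutors as separately typed nodes lets a model attach different parameters to each node and edge type, which is the general idea behind heterogeneous graph networks.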
To address the problem, we propose augmenting TExt Generation via Task-specific and Open-world Knowledge (TegTok) in a unified framework.
This paper focuses on data augmentation for low-resource natural language understanding (NLU) tasks.
The sequence representation plays a key role in learning the matching degree between the dialogue context and the response.
In such a low-resource setting, we devise a novel conversational agent, Divter, in order to isolate parameters that depend on multimodal dialogues from the entire generation model.
Furthermore, we propose to evaluate CRS models in an end-to-end manner, which reflects the overall performance of the entire system rather than that of individual modules, in contrast to the separate evaluations of the two modules used in previous work.
Second, only the items mentioned in the training corpus have a chance to be recommended in the conversation.
We study the problem of coarse-grained response selection in retrieval-based dialogue systems.
In recent years, the worldwide business of online discussion and opinion sharing on social media has been booming.
Sequence-to-Sequence (S2S) neural text generation models, especially the pre-trained ones (e.g., BART and T5), have exhibited compelling performance on various natural language generation tasks.
The retriever aims to retrieve an image correlated with the dialog from an image index, while the visual concept detector extracts rich visual knowledge from the image.
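A minimal sketch of the retrieval step follows. The excerpt does not say how the retriever scores images, so this assumes that the dialog and every indexed image have already been embedded as vectors and that cosine similarity is the relevance measure. The names `cosine`, `retrieve`, `image_index`, and the toy vectors are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(dialog_vec, image_index):
    """Return the id of the indexed image whose embedding is closest to the dialog."""
    return max(image_index, key=lambda image_id: cosine(dialog_vec, image_index[image_id]))


# Hypothetical pre-computed embeddings; a real system would obtain these from encoders.
image_index = {
    "beach.jpg": [0.9, 0.1, 0.0],
    "kitchen.jpg": [0.1, 0.8, 0.3],
    "street.jpg": [0.2, 0.2, 0.9],
}
dialog_vec = [0.8, 0.2, 0.1]
best = retrieve(dialog_vec, image_index)  # "beach.jpg" for these toy vectors
```

A linear scan like this is only practical for small indexes; larger ones typically rely on approximate nearest-neighbor search.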
While various forms of models have been proposed for the link prediction task, most of them are designed based on a few known relation patterns in several well-known datasets.
In this paper, we formalize music-conditioned dance generation as a sequence-to-sequence learning problem and devise a novel seq2seq architecture to efficiently process long sequences of music features and capture the fine-grained correspondence between music and dance.
Ranked #1 on Motion Synthesis on BRACE
Thus, we propose learning a response generation model with both image-grounded dialogues and textual dialogues by assuming that the visual scene information at the time of a conversation can be represented by an image, and trying to recover the latent images of the textual dialogues through text-to-image generation techniques.
The 20 Questions (Q20) game is a well-known game that encourages deductive reasoning and creativity.