Specifically, we first embed the multimodal features into a unified Transformer semantic space to prompt inter-modal interactions, and then devise a feature alignment and intention reasoning (FAIR) layer to perform cross-modal entity alignment and fine-grained key-value reasoning, so as to effectively identify the user’s intention and generate more accurate responses.
Based on such a mechanism, we further propose an intention reasoning network (IR-Net), which consists of joint and multi-hop reasoning, to obtain intention-aware representations of conceptual tokens that can be used to capture the concept shifts involved in task-oriented conversations, so as to effectively identify the user’s intention and generate more accurate responses.
The model first trains on the sentence pairs in the text, calculates the similarity between words in each sentence pair using cosine similarity, and uses that similarity to fine-tune the BERT model used for the named entity recognition task, so as to alleviate word ambiguity.
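As a minimal sketch of the cosine-similarity step, the snippet below builds a word-by-word similarity matrix for one sentence pair. The function names and the toy 3-dimensional embeddings are hypothetical and chosen only for illustration; how the resulting similarities are fed into BERT fine-tuning is not specified here and is not shown.

```python
import math

def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0  # convention assumed here for zero vectors
    return dot / (norm_u * norm_v)

def pairwise_word_similarity(sent_a, sent_b):
    """Similarity matrix: rows are words of sent_a, columns are words of sent_b."""
    return [[cosine_similarity(u, v) for v in sent_b] for u in sent_a]

# Toy word embeddings (made up for illustration, not from any trained model).
sentence_a = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
sentence_b = [[1.0, 0.0, 0.0], [1.0, 1.0, 0.0]]
sim = pairwise_word_similarity(sentence_a, sentence_b)
```

Here `sim[i][j]` holds the similarity between word `i` of the first sentence and word `j` of the second; identical vectors score 1 and orthogonal vectors score 0.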
Afterward, by performing a simplified RGCN-based node information propagation on the constructed heterogeneous graph, the embeddings of users and items can be adjusted with textual knowledge, which effectively alleviates the negative effects of data sparsity.
IRSs usually face the large discrete action space problem, which makes most of the existing RL-based recommendation methods inefficient.
The key to Weakly Supervised Fine-grained Image Classification (WFGIC) is how to pick out the discriminative regions and learn discriminative features from them.
Ranked #15 on Fine-Grained Image Classification on FGVC Aircraft
To design a more efficient lightweight convolutional neural network, a Depthwise-Pointwise-Depthwise inverted bottleneck block (DPD block) is proposed, and DPDNet is designed by stacking DPD blocks.
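To give a rough sense of why a depthwise-pointwise-depthwise arrangement can be lightweight, the sketch below counts weights for one such stack and for a single standard convolution with the same input and output channels. The exact layout of the DPD block (kernel sizes, expansion ratio, normalization, activations) is not given here, so the sketch assumes a k×k depthwise convolution, a 1×1 pointwise convolution, and a second k×k depthwise convolution, all without biases. The function names are hypothetical.

```python
def depthwise_params(channels, kernel_size):
    """Weights in a depthwise conv: one k x k filter per channel, no bias."""
    return channels * kernel_size * kernel_size

def pointwise_params(c_in, c_out):
    """Weights in a 1 x 1 conv that mixes channels, no bias."""
    return c_in * c_out

def standard_conv_params(c_in, c_out, kernel_size):
    """Weights in a dense k x k conv, no bias."""
    return c_in * c_out * kernel_size * kernel_size

def dpd_stack_params(c_in, c_out, kernel_size):
    """Assumed layout: depthwise on c_in, pointwise c_in -> c_out, depthwise on c_out."""
    return (
        depthwise_params(c_in, kernel_size)
        + pointwise_params(c_in, c_out)
        + depthwise_params(c_out, kernel_size)
    )

# Illustrative channel sizes, not taken from the DPDNet paper.
dpd = dpd_stack_params(c_in=32, c_out=64, kernel_size=3)
dense = standard_conv_params(c_in=32, c_out=64, kernel_size=3)
print(dpd, dense)  # prints "2912 18432"
```

Under these assumptions the stacked form uses about one sixth of the weights of the dense 3×3 convolution; the actual savings in DPDNet depend on design details not stated here.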
We cast retrosynthesis as a machine translation problem by introducing a special Tensor2Tensor, an entirely attention-based and fully data-driven model.
In this paper, we propose a novel network design mechanism for efficient embedded computing.
Ranked #4 on Face Verification on CFP-FP