no code implementations • 3 Jan 2024 • Wei Ai, FuChen Zhang, Tao Meng, Yuntao Shou, HongEn Shao, Keqin Li
To address the above issues, we propose a two-stage emotion recognition model based on graph contrastive learning (TS-GCL).
no code implementations • 28 Dec 2023 • Yuntao Shou, Tao Meng, Wei Ai, Keqin Li
However, existing feature fusion methods usually map the features of different modalities into the same feature space for information fusion, which cannot eliminate the heterogeneity between modalities.
no code implementations • 17 Dec 2023 • Wei Ai, Yuntao Shou, Tao Meng, Keqin Li
Specifically, we construct a weighted multi-relationship graph to simultaneously capture the dependencies between speakers and event relations in a dialogue.
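A weighted multi-relationship graph of this kind can be sketched as one adjacency tensor with a slice per relation type. The relation types and weights below (same-speaker links, distance-weighted temporal adjacency) are illustrative assumptions, not the paper's exact construction:

```python
import numpy as np

# Illustrative sketch: utterance nodes with two relation types,
# stacked as a (num_relations, n, n) weighted adjacency tensor.
speakers = [0, 1, 0, 1]          # speaker id of each utterance (assumed toy data)
n = len(speakers)

A = np.zeros((2, n, n))
for i in range(n):
    for j in range(n):
        if i != j and speakers[i] == speakers[j]:
            A[0, i, j] = 1.0                     # relation 0: same speaker
        if i != j:
            A[1, i, j] = 1.0 / abs(i - j)        # relation 1: temporal, decaying weight
```

Each relation slice can then be processed by its own graph convolution and the results aggregated, which is the usual way multi-relational GNNs consume such a tensor.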
no code implementations • 11 Dec 2023 • Tao Meng, Yuntao Shou, Wei Ai, Nan Yin, Keqin Li
The main task of Multimodal Emotion Recognition in Conversations (MERC) is to identify the emotions expressed across modalities, e.g., text, audio, image, and video, which is a significant development direction for realizing machine intelligence.
no code implementations • 10 Dec 2023 • Yuntao Shou, Tao Meng, Wei Ai, Nan Yin, Keqin Li
Unlike traditional single-utterance multimodal emotion recognition or single-modal conversation emotion recognition, MCER is a more challenging problem, as it must model more complex emotional interaction relationships.
no code implementations • 5 Dec 2023 • Yuntao Shou, Wei Ai, Tao Meng
Furthermore, this paper innovatively introduces information bottleneck theory into graph contrastive learning to maximize task-related information while minimizing task-independent redundant information.
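The contrastive part of such an objective is commonly an InfoNCE-style loss that pulls two views of the same node together while pushing other nodes apart. The sketch below shows that generic loss only (a standard technique, not the paper's exact information-bottleneck objective; function name and temperature are assumptions):

```python
import numpy as np

def info_nce(z1, z2, tau=0.5):
    """Generic InfoNCE contrastive loss: row i of z1 and row i of z2
    are two views of the same node (positives); all other rows are negatives."""
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)   # cosine normalization
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    sim = z1 @ z2.T / tau                                 # n x n similarity logits
    sim = sim - sim.max(axis=1, keepdims=True)            # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                    # agreement on the diagonal

rng = np.random.default_rng(1)
z = rng.standard_normal((8, 32))
loss_aligned = info_nce(z, z)                             # identical views
loss_random = info_nce(z, rng.standard_normal((8, 32)))   # unrelated views
```

Minimizing this loss maximizes a lower bound on the mutual information between the two views, which is the quantity the information-bottleneck framing then trades off against task-irrelevant information.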
no code implementations • 4 Dec 2023 • Yuntao Shou, Wei Ai, Tao Meng, Keqin Li
Zero-shot age estimation aims to learn age-related feature information from input images and to infer the age of a given person's image or video frame without age-specific training samples.
no code implementations • 16 Jun 2023 • Yuntao Shou, Xiangyong Cao, Deyu Meng, Bo Dong, Qinghua Zheng
By setting a matching weight and calculating attention scores between modal features row by row, LMAM contains fewer parameters than the self-attention method.
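The parameter saving comes from replacing a full d x d projection with two rank-r factors, so the cost drops from d*d to 2*d*r parameters per projection. A minimal sketch of this idea, assuming two modality feature matrices and a single low-rank cross-modal attention step (names and shapes are illustrative, not the authors' LMAM implementation):

```python
import numpy as np

rng = np.random.default_rng(0)
n, d, r = 4, 16, 2                      # utterances, feature dim, low rank (r << d)

X = rng.standard_normal((n, d))         # e.g. text features
Y = rng.standard_normal((n, d))         # e.g. audio features

# Low-rank matching projection W ≈ U @ V, with U: (d, r) and V: (r, d),
# instead of a full (d, d) weight as in standard self-attention.
U = rng.standard_normal((d, r))
V = rng.standard_normal((r, d))

scores = (X @ U @ V) @ Y.T / np.sqrt(d)              # n x n cross-modal scores
weights = np.exp(scores - scores.max(axis=1, keepdims=True))
weights /= weights.sum(axis=1, keepdims=True)        # row-wise softmax
fused = weights @ Y                                  # text rows attend over audio

full_params, low_rank_params = d * d, 2 * d * r      # 256 vs 64 here
```

With d = 16 and r = 2 the low-rank factorization uses a quarter of the parameters of the full projection, and the gap widens as d grows.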