S+PAGE: A Speaker and Position-Aware Graph Neural Network Model for Emotion Recognition in Conversation

23 Dec 2021 · Chen Liang, Chong Yang, Jing Xu, Juyang Huang, Yongliang Wang, Yang Dong

Emotion recognition in conversation (ERC) has attracted much attention in recent years because of its importance in a wide range of applications. Existing ERC methods mostly model the self and inter-speaker context separately, which leaves too little interaction between the two. In this paper, we propose a novel Speaker and Position-Aware Graph neural network model for ERC (S+PAGE), a three-stage model that combines the benefits of the Transformer and the relational graph convolutional network (R-GCN) for better contextual modeling. First, a two-stream conversational Transformer extracts coarse self and inter-speaker contextual features for each utterance. Next, a speaker and position-aware conversation graph is constructed, and an enhanced R-GCN model, called PAG, refines the coarse features under the guidance of a relative positional encoding. Finally, the features from both preceding stages are fed into a conditional random field layer to model emotion transfer.
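The abstract does not say how the speaker and position-aware conversation graph is built, so the following is only a minimal sketch under stated assumptions: each utterance is a node, edges link utterances within a fixed context window, the relation type records whether the two utterances share a speaker, and the signed turn offset stands in for the relative position. The function name `build_conversation_graph`, the `window` parameter, and the `"self"`/`"inter"` labels are hypothetical and are not taken from the paper.

```python
from typing import List, Tuple

# (source utterance, target utterance, relation type, relative position)
Edge = Tuple[int, int, str, int]


def build_conversation_graph(speakers: List[str], window: int = 2) -> List[Edge]:
    """Link every utterance to its neighbours within `window` turns.

    Assumption: the relation is "self" when both utterances come from the
    same speaker and "inter" otherwise; the relative position is the signed
    offset (source index minus target index).
    """
    edges: List[Edge] = []
    n = len(speakers)
    for target in range(n):
        first = max(0, target - window)
        last = min(n, target + window + 1)
        for source in range(first, last):
            relation = "self" if speakers[source] == speakers[target] else "inter"
            edges.append((source, target, relation, source - target))
    return edges


# A four-turn dialogue between two alternating speakers.
speakers = ["A", "B", "A", "B"]
edges = build_conversation_graph(speakers, window=1)
for edge in edges:
    print(edge)
```

In a relational GCN, each distinct relation type would typically get its own weight matrix, while the relative offset could index a learned positional embedding; how PAG actually combines the two is not stated in the abstract.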

| Task | Dataset | Model | Metric Name | Metric Value | Global Rank |
|---|---|---|---|---|---|
| Emotion Recognition in Conversation | DailyDialog | S+PAGE | Micro-F1 | 64.07 | #1 |
| Emotion Recognition in Conversation | EmoryNLP | S+PAGE | Weighted-F1 | 39.14 | #10 |
| Emotion Recognition in Conversation | IEMOCAP | S+PAGE | Weighted-F1 | 68.72 | #21 |
| Emotion Recognition in Conversation | MELD | S+PAGE | Weighted-F1 | 63.32 | #36 |

Methods