Contextualized Emotion Recognition in Conversation as Sequence Tagging

1 Jul 2020 · Yan Wang, Jiayu Zhang, Jun Ma, Shaojun Wang, Jing Xiao

Emotion recognition in conversation (ERC) is an important topic for developing empathetic machines in a variety of areas, including social opinion mining and health care. In this paper, we propose a method that models the ERC task as sequence tagging, where a Conditional Random Field (CRF) layer is leveraged to learn the emotional consistency in the conversation. We employ LSTM-based encoders that capture the self and inter-speaker dependencies of interlocutors to generate contextualized utterance representations, which are fed into the CRF layer. To capture long-range global context, we use a multi-layer Transformer encoder to enhance the LSTM-based encoder. Experiments show that our method benefits from modeling emotional consistency and outperforms the current state-of-the-art methods on multiple emotion classification datasets.
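To make the sequence-tagging view concrete, the following is a minimal sketch of how a linear-chain CRF picks one emotion label per utterance at inference time. It is not the authors' implementation: the function name `viterbi_decode`, the two-label setup, and all scores are assumptions made for illustration. In the method described above, the emission scores would come from the contextualized utterance representations and the transition scores would be learned by the CRF layer.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring label sequence for one conversation.

    emissions[t][k]   : score of label k for utterance t (hypothetical encoder output)
    transitions[j][k] : score of label j being followed by label k
    Assumes the conversation has at least one utterance.
    """
    num_labels = len(transitions)
    # best[k] = score of the best path that ends in label k at the current utterance
    best = list(emissions[0])
    backpointers = []
    for t in range(1, len(emissions)):
        new_best, pointers = [], []
        for k in range(num_labels):
            # Choose the previous label that gives the best path into label k.
            prev = max(range(num_labels),
                       key=lambda j: best[j] + transitions[j][k])
            pointers.append(prev)
            new_best.append(best[prev] + transitions[prev][k] + emissions[t][k])
        best = new_best
        backpointers.append(pointers)
    # Trace the best path backwards from the best final label.
    last = max(range(num_labels), key=lambda k: best[k])
    path = [last]
    for pointers in reversed(backpointers):
        last = pointers[last]
        path.append(last)
    path.reverse()
    return path


# Illustrative scores only: label 0 = "neutral", label 1 = "happy".
emissions = [[0.0, 2.0], [0.6, 0.5], [0.0, 2.0]]
no_consistency = [[0.0, 0.0], [0.0, 0.0]]    # reduces to per-utterance argmax
consistency = [[1.0, -1.0], [-1.0, 1.0]]     # rewards keeping the same emotion

independent_path = viterbi_decode(emissions, no_consistency)
consistent_path = viterbi_decode(emissions, consistency)
print(independent_path, consistent_path)
```

With zero transition scores the middle utterance is tagged "neutral" because its emission score is marginally higher; with transitions that reward staying in the same emotion, the decoder tags all three utterances "happy". This is the sense in which a CRF layer can exploit emotional consistency across adjacent utterances.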

Task | Dataset | Model | Metric Name | Metric Value | Global Rank
Emotion Recognition in Conversation | DailyDialog | CESTa | Micro-F1 | 63.12 | #2
Emotion Recognition in Conversation | IEMOCAP | CESTa | Weighted-F1 | 67.1 | #14
Emotion Recognition in Conversation | MELD | CESTa | Weighted-F1 | 58.36 | #39