Contextualized Emotion Recognition in Conversation as Sequence Tagging
Emotion recognition in conversation (ERC) is an important topic for developing empathetic machines in areas such as social opinion mining and health care. In this paper, we propose a method that models the ERC task as sequence tagging, where a Conditional Random Field (CRF) layer is leveraged to learn the emotional consistency in the conversation. We employ LSTM-based encoders that capture the self- and inter-speaker dependencies of interlocutors to generate contextualized utterance representations, which are fed into the CRF layer. To capture long-range global context, we use a multi-layer Transformer encoder to enhance the LSTM-based encoder. Experiments show that our method benefits from modeling emotional consistency and outperforms the current state-of-the-art methods on multiple emotion classification datasets.
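To make the sequence-tagging view concrete, the following is a minimal sketch of how a CRF-style decoder could pick one emotion tag per utterance by combining per-utterance scores with tag-to-tag transition scores. It is not the authors' implementation: the function name `viterbi_decode`, the two-tag label set, and all score values are assumptions chosen for illustration, and the encoder (LSTM plus Transformer) that would produce the emission scores is omitted.

```python
from typing import List, Sequence


def viterbi_decode(emissions: Sequence[Sequence[float]],
                   transitions: Sequence[Sequence[float]]) -> List[int]:
    """Return the highest-scoring tag sequence for one conversation.

    emissions[t][k]   : score of emotion k for utterance t (hypothetical
                        encoder output).
    transitions[j][k] : score of emotion j being followed by emotion k.
    """
    if not emissions:
        return []
    num_tags = len(emissions[0])
    score = list(emissions[0])          # best score of any path ending in each tag
    backpointers: List[List[int]] = []  # best previous tag for each (step, tag)
    for t in range(1, len(emissions)):
        new_score, pointers = [], []
        for k in range(num_tags):
            best_prev = max(range(num_tags),
                            key=lambda j: score[j] + transitions[j][k])
            pointers.append(best_prev)
            new_score.append(score[best_prev] + transitions[best_prev][k]
                             + emissions[t][k])
        score = new_score
        backpointers.append(pointers)
    # Follow the backpointers from the best final tag to recover the path.
    path = [max(range(num_tags), key=lambda k: score[k])]
    for pointers in reversed(backpointers):
        path.append(pointers[path[-1]])
    path.reverse()
    return path


# Toy conversation with three utterances; tags: 0 = neutral, 1 = happy.
# All numbers are made up for illustration.
emissions = [[0.0, 2.0], [0.6, 0.5], [0.0, 2.0]]
consistent = [[1.0, 0.0], [0.0, 1.0]]   # rewards keeping the same emotion
independent = [[0.0, 0.0], [0.0, 0.0]]  # no transition preference

print(viterbi_decode(emissions, independent))  # [1, 0, 1]
print(viterbi_decode(emissions, consistent))   # [1, 1, 1]
```

With no transition preference, the ambiguous middle utterance is tagged on its own score alone. With transitions that reward emotional consistency, the decoder keeps it in line with its neighbours, which is the kind of effect the CRF layer is meant to learn.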
Results from the Paper
Task | Dataset | Model | Metric Name | Metric Value | Global Rank | Benchmark |
---|---|---|---|---|---|---|
Emotion Recognition in Conversation | DailyDialog | CESTa | Micro-F1 | 63.12 | # 2 | |
Emotion Recognition in Conversation | IEMOCAP | CESTa | Weighted-F1 | 67.1 | # 14 | |
Emotion Recognition in Conversation | MELD | CESTa | Weighted-F1 | 58.36 | # 39 | |
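The table reports two different averages of F1. As a rough illustration of how they differ, the sketch below computes both on a tiny made-up label set. The function name `f1_scores` and the toy labels are assumptions, and the sketch does not reproduce any dataset-specific evaluation protocol (for example, which classes a benchmark includes in the average).

```python
from collections import Counter
from typing import Hashable, Sequence, Tuple


def f1_scores(gold: Sequence[Hashable],
              pred: Sequence[Hashable]) -> Tuple[float, float]:
    """Return (micro-F1, weighted-F1) for single-label predictions."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for g, p in zip(gold, pred):
        if g == p:
            tp[g] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[g] += 1  # gold class gains a false negative

    def f1(t: int, f_p: int, f_n: int) -> float:
        denom = 2 * t + f_p + f_n
        return 2 * t / denom if denom else 0.0

    # Micro-F1 pools the counts over all classes before computing F1.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    # Weighted-F1 averages per-class F1, weighted by gold class frequency.
    support = Counter(gold)
    weighted = sum(n * f1(tp[c], fp[c], fn[c])
                   for c, n in support.items()) / len(gold)
    return micro, weighted


# Hypothetical gold and predicted emotions for five utterances.
gold = ["neutral", "neutral", "neutral", "happy", "sad"]
pred = ["neutral", "neutral", "happy", "happy", "neutral"]
micro, weighted = f1_scores(gold, pred)
print(round(micro, 4), round(weighted, 4))  # 0.6 0.5333
```

When every utterance has exactly one label and all classes are counted, micro-F1 equals accuracy, whereas weighted-F1 is pulled down by classes that are never predicted correctly, such as "sad" in this toy example.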