TASK	DATASET	MODEL	METRIC NAME	METRIC VALUE	GLOBAL RANK
Emotion Recognition in Conversation	CMU-MOSI	Audio + Text (Stage III)	F1 score	0.858	# 1
Multimodal Emotion Recognition	IEMOCAP	Audio + Text (Stage III)	F1	0.705	# 8
Multimodal Emotion Recognition	MELD	Audio + Text (Stage III)	F1	65.8	# 1
Emotion Recognition in Conversation	MELD	Audio + Text (Stage III)	Weighted-F1	65.8	# 20

HCAM -- Hierarchical Cross Attention Model for Multi-modal Emotion Recognition

14 Apr 2023 · Soumya Dutta, Sriram Ganapathy

Emotion recognition in conversations is challenging because emotion is expressed across multiple modalities. We propose a hierarchical cross-attention model (HCAM) for multi-modal emotion recognition that combines recurrent and co-attention neural network models. The input to the model consists of two modalities: (i) audio data, processed through a learnable wav2vec approach, and (ii) text data, represented using a bidirectional encoder representations from transformers (BERT) model. The audio and text representations are processed using a set of bi-directional recurrent neural network layers with self-attention that convert each utterance in a given conversation to a fixed-dimensional embedding. To incorporate contextual knowledge and information across the two modalities, the audio and text embeddings are combined using a co-attention layer that attempts to weigh the utterance-level embeddings relevant to the task of emotion recognition. The neural network parameters in the audio layers, the text layers, and the multi-modal co-attention layers are hierarchically trained for the emotion classification task. We perform experiments on three established datasets, namely IEMOCAP, MELD, and CMU-MOSI, where we illustrate that the proposed model improves significantly over other benchmarks and helps achieve state-of-the-art results on all these datasets.
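The two attention steps named in the abstract can be illustrated with a small, dependency-free sketch. This is not the authors' implementation: it assumes plain scaled dot-product attention with no learned projections, omits the wav2vec, BERT, and recurrent layers entirely, and all function names (`attention_pool`, `cross_attend`, `co_attention`) are hypothetical. It only shows how frame-level features could be pooled into one fixed-size utterance embedding, and how utterance embeddings from one modality could attend over those of the other.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, vectors):
    """Combine equal-length vectors using the given weights."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]


def attention_pool(frames, query):
    """Collapse a variable-length sequence of frame vectors into one
    fixed-size vector. `query` stands in for a learned attention
    vector (an assumption of this sketch)."""
    weights = softmax([dot(frame, query) for frame in frames])
    return weighted_sum(weights, frames)


def cross_attend(queries, context):
    """For each utterance embedding in `queries`, attend over the
    utterance embeddings of the other modality in `context`."""
    scale = math.sqrt(len(queries[0]))
    attended = []
    for q in queries:
        weights = softmax([dot(q, c) / scale for c in context])
        attended.append(weighted_sum(weights, context))
    return attended


def co_attention(audio_utts, text_utts):
    """Attend in both directions and concatenate the two results
    per utterance (concatenation is an assumed fusion choice)."""
    audio_to_text = cross_attend(audio_utts, text_utts)
    text_to_audio = cross_attend(text_utts, audio_utts)
    return [a + t for a, t in zip(audio_to_text, text_to_audio)]


# Toy conversation: three utterances, 2-dimensional embeddings per modality.
audio = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused = co_attention(audio, text)  # one 4-dimensional vector per utterance
```

In a trained model, the attention queries and projections would be learned parameters and the fused vectors would feed an emotion classifier; the sketch keeps them fixed so the data flow stays visible.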

Tasks

Emotion Classification

Emotion Recognition

Emotion Recognition in Conversation

Multimodal Emotion Recognition

Datasets

IEMOCAP

MELD

CMU-MOSI

Results from the Paper

Ranked #1 on Multimodal Emotion Recognition on MELD
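For readers unfamiliar with the Weighted-F1 metric reported for MELD, the following is a minimal sketch of the standard definition: the per-class F1 scores averaged with weights proportional to each class's number of true instances. The function name `weighted_f1` and the toy labels are illustrative assumptions and are not taken from the paper or its evaluation code.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Support-weighted average of per-class F1 scores."""
    support = Counter(y_true)  # number of true instances per class
    total = len(y_true)
    score = 0.0
    for label, count in support.items():
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += (count / total) * f1
    return score


# Toy example with made-up labels.
y_true = ["joy", "joy", "anger", "neutral"]
y_pred = ["joy", "anger", "anger", "neutral"]
print(weighted_f1(y_true, y_pred))
```

Because the weights follow class frequency, frequent classes dominate the score on imbalanced data, which is worth keeping in mind when comparing against macro-averaged numbers.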
