319 papers with code • 5 benchmarks • 40 datasets
Emotion Recognition is an important area of research to enable effective human-computer interaction. Human emotions can be detected using speech signals, facial expressions, body language, and electroencephalography (EEG). Source: Using Deep Autoencoders for Facial Expression Recognition
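As a rough illustration of the facial-expression route, the sketch below is a hypothetical PyTorch model (not the cited paper's exact architecture): a deep autoencoder compresses face images, and a small classifier predicts the emotion from the bottleneck features.

```python
# Minimal sketch, assuming 48x48 grayscale faces and 7 emotion classes:
# an autoencoder whose latent code also feeds an emotion classifier.
import torch
import torch.nn as nn

class EmotionAutoencoder(nn.Module):
    def __init__(self, input_dim=48 * 48, latent_dim=64, num_emotions=7):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(input_dim, 512), nn.ReLU(),
            nn.Linear(512, latent_dim), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.Linear(latent_dim, 512), nn.ReLU(),
            nn.Linear(512, input_dim), nn.Sigmoid(),
        )
        self.classifier = nn.Linear(latent_dim, num_emotions)

    def forward(self, x):
        z = self.encoder(x)          # compressed representation
        recon = self.decoder(z)      # reconstruction for the autoencoder loss
        logits = self.classifier(z)  # emotion prediction from the latent code
        return recon, logits

model = EmotionAutoencoder()
faces = torch.rand(8, 48 * 48)       # placeholder batch of flattened faces
recon, logits = model(faces)
print(logits.shape)                   # torch.Size([8, 7])
```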
We propose several strong multimodal baselines and show the importance of contextual and multimodal information for emotion recognition in conversations.
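A minimal sketch of what such a contextual baseline can look like (hypothetical PyTorch code with assumed utterance-feature dimensions, not the paper's model): a GRU runs over the sequence of utterance representations in a conversation, so each utterance's emotion prediction can draw on the surrounding dialogue.

```python
# Contextual emotion recognition sketch: per-utterance features are re-encoded
# with a bidirectional GRU before classification.
import torch
import torch.nn as nn

class ContextualEmotionClassifier(nn.Module):
    def __init__(self, utt_dim=100, hidden=64, num_emotions=6):
        super().__init__()
        self.context_gru = nn.GRU(utt_dim, hidden, batch_first=True,
                                  bidirectional=True)
        self.classifier = nn.Linear(2 * hidden, num_emotions)

    def forward(self, utterances):       # (batch, num_utterances, utt_dim)
        context, _ = self.context_gru(utterances)
        return self.classifier(context)  # one emotion prediction per utterance

model = ContextualEmotionClassifier()
conversation = torch.rand(1, 5, 100)      # one conversation with 5 utterances
print(model(conversation).shape)          # torch.Size([1, 5, 6])
```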
Humans convey their intentions through both verbal and nonverbal behaviors during face-to-face communication.
Speech emotion recognition is a challenging task, and well-performing classifiers have relied heavily on models built from audio features.
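For illustration, a common audio-only baseline summarizes MFCC features per clip and feeds them to a standard classifier. The sketch below uses librosa and scikit-learn with synthetic placeholder clips and labels; a real setup would use an emotion-labeled speech corpus.

```python
# Audio-only baseline sketch: MFCC summary statistics + an SVM classifier.
import numpy as np
import librosa
from sklearn.svm import SVC

def mfcc_features(y, sr=16000, n_mfcc=13):
    """Summarize an audio clip with per-coefficient MFCC mean and std."""
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Placeholder clips (one-second sine tones) and labels, for illustration only.
sr = 16000
clips = [np.sin(2 * np.pi * f * np.arange(sr) / sr) for f in (220.0, 440.0)]
labels = ["neutral", "happy"]

X = np.stack([mfcc_features(c, sr) for c in clips])
clf = SVC(kernel="rbf").fit(X, labels)
print(clf.predict(X[:1]))
```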
Emotion cause extraction (ECE), the task aimed at extracting the potential causes behind certain emotions in text, has gained much attention in recent years due to its wide applications.
The proposed architecture achieves 99.6% for CKP and 98.63% for MMI, therefore performing better than the state of the art using CNNs.
In this paper, we therefore consider the task of multimodal sentiment analysis based on audio and text, and propose a novel fusion strategy combining multi-feature fusion and multi-modality fusion to improve the accuracy of audio-text sentiment analysis.
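A hedged sketch of such two-level fusion (hypothetical feature dimensions and layer sizes, not the paper's model): several audio feature sets are concatenated first (multi-feature fusion), and the resulting audio vector is then concatenated with a text embedding (multi-modality fusion) before the sentiment classifier.

```python
# Two-level fusion sketch for audio-text sentiment analysis.
import torch
import torch.nn as nn

class AudioTextFusion(nn.Module):
    def __init__(self, mfcc_dim=13, prosody_dim=4, text_dim=300, num_classes=3):
        super().__init__()
        # Multi-feature fusion: concatenate audio feature sets, then project.
        self.audio_proj = nn.Linear(mfcc_dim + prosody_dim, 64)
        self.text_proj = nn.Linear(text_dim, 64)
        # Multi-modality fusion: concatenate audio and text representations.
        self.classifier = nn.Sequential(
            nn.Linear(64 + 64, 64), nn.ReLU(),
            nn.Linear(64, num_classes),
        )

    def forward(self, mfcc, prosody, text_emb):
        audio = torch.relu(self.audio_proj(torch.cat([mfcc, prosody], dim=-1)))
        text = torch.relu(self.text_proj(text_emb))
        return self.classifier(torch.cat([audio, text], dim=-1))

model = AudioTextFusion()
logits = model(torch.rand(2, 13), torch.rand(2, 4), torch.rand(2, 300))
print(logits.shape)   # torch.Size([2, 3])
```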