Multimodal Sentiment Analysis
69 papers with code • 5 benchmarks • 6 datasets
Multimodal sentiment analysis is the task of performing sentiment analysis with multiple data sources, for example a camera feed of someone's face together with their recorded speech.
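The simplest way to combine such sources is late fusion: run a separate sentiment model per modality and average their class probabilities. A minimal numpy sketch, where the probability values and the two-modality setup are purely illustrative:

```python
import numpy as np

# Hypothetical per-modality sentiment probabilities over
# (negative, neutral, positive), e.g. from a face model and a speech model.
visual_probs = np.array([0.2, 0.3, 0.5])   # camera feed of the face
audio_probs = np.array([0.1, 0.2, 0.7])    # recorded speech

def late_fusion(prob_list, weights=None):
    """Weighted average of per-modality class probability vectors."""
    probs = np.stack(prob_list)
    if weights is None:
        weights = np.ones(len(prob_list)) / len(prob_list)
    fused = np.average(probs, axis=0, weights=weights)
    return fused / fused.sum()   # renormalise against rounding drift

fused = late_fusion([visual_probs, audio_probs])
label = ["negative", "neutral", "positive"][int(np.argmax(fused))]
```

Late fusion keeps the per-modality models independent, which is convenient when one modality (say, the camera feed) is missing at inference time.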
Speech emotion recognition is a challenging task, and well-performing classifiers have relied heavily on audio features.
Human language is often multimodal, comprising a mixture of natural language, facial gestures, and acoustic behaviors.
Humans convey their intentions through both verbal and nonverbal behaviors during face-to-face communication.
In this paper we consider multimodal sentiment analysis over audio and text, and propose a novel fusion strategy combining multi-feature fusion and multi-modality fusion to improve the accuracy of audio-text sentiment analysis.
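The two fusion levels can be sketched with plain concatenation: multi-feature fusion combines several feature sets of one modality, and multi-modality fusion then combines the per-modality vectors before a classifier head. The feature names, dimensions, and random weights below are all hypothetical, not the paper's actual design:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature vectors; the paper's real feature sets are not given here.
mfcc_feat = rng.normal(size=40)      # audio feature set 1 (e.g. MFCC statistics)
prosody_feat = rng.normal(size=10)   # audio feature set 2 (e.g. pitch/energy)
text_embed = rng.normal(size=50)     # sentence embedding for the transcript

def fuse_features(feats):
    """Multi-feature fusion: combine several feature sets of one modality."""
    return np.concatenate(feats)

def fuse_modalities(modality_vecs):
    """Multi-modality fusion: combine the per-modality representations."""
    return np.concatenate(modality_vecs)

audio_vec = fuse_features([mfcc_feat, prosody_feat])   # 50-dim audio vector
joint_vec = fuse_modalities([audio_vec, text_embed])   # 100-dim joint vector

# Dummy linear classifier head over 3 sentiment classes on the joint vector.
W = rng.normal(size=(3, joint_vec.size))
logits = W @ joint_vec
probs = np.exp(logits - logits.max())
probs /= probs.sum()
```

In practice each fusion stage would feed a trained network rather than raw concatenation, but the two-level structure is the same.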
The platform features a fully modular video sentiment analysis framework consisting of data management, feature extraction, model training, and result analysis modules.
In this paper, we propose the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model, which is composed of two modules.
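The two ingredients named in the title can be illustrated in isolation: a gate that mixes two modalities per timestep, and temporal attention that pools the fused sequence into one summary vector. This is a minimal numpy sketch with dummy random weights and made-up dimensions, not the GME-LSTM(A) architecture itself (the LSTM is omitted):

```python
import numpy as np

rng = np.random.default_rng(1)
T, d = 5, 8   # hypothetical sequence length and per-modality feature size

# Hypothetical per-timestep features for two modalities.
audio = rng.normal(size=(T, d))
visual = rng.normal(size=(T, d))

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Gated multimodal embedding: a (dummy) learned gate decides, per timestep,
# how much of each modality enters the fused representation.
Wg = rng.normal(size=(2 * d,))
gate = sigmoid(np.concatenate([audio, visual], axis=1) @ Wg)   # shape (T,)
fused = gate[:, None] * audio + (1.0 - gate)[:, None] * visual  # shape (T, d)

# Temporal attention: score each timestep, softmax, and pool the sequence.
wa = rng.normal(size=(d,))
scores = fused @ wa
attn = np.exp(scores - scores.max())
attn /= attn.sum()                 # attention weights over timesteps
summary = attn @ fused             # attention-weighted summary, shape (d,)
```

The gating suppresses a noisy modality at individual timesteps, while the attention lets the model focus on the emotionally salient moments of the clip.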
We propose a novel approach to multimodal sentiment analysis using deep neural networks combining visual analysis and natural language processing.