Multimodal Sentiment Analysis
72 papers with code • 5 benchmarks • 7 datasets
Multimodal sentiment analysis is the task of performing sentiment analysis with multiple data sources, e.g. a camera feed of someone's face together with their recorded speech.
(Image credit: ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection)
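As a concrete illustration of the task, a simple late-fusion baseline averages per-modality class probabilities (e.g. from face, speech, and text models) before picking a sentiment label. This is a minimal hypothetical sketch, not any specific paper's method; the function and weight names are assumptions.

```python
import numpy as np

def softmax(logits):
    # Numerically stable softmax over a 1-D logit vector.
    e = np.exp(logits - logits.max())
    return e / e.sum()

def late_fusion_predict(face_logits, speech_logits, text_logits,
                        weights=(1 / 3, 1 / 3, 1 / 3)):
    """Weighted average of per-modality probabilities (hypothetical baseline)."""
    probs = [softmax(np.asarray(l, dtype=float))
             for l in (face_logits, speech_logits, text_logits)]
    fused = sum(w * p for w, p in zip(weights, probs))
    return int(np.argmax(fused))  # index of the predicted sentiment class
```

Late fusion is the weakest form of multimodal integration; the papers below largely study richer joint representations that model cross-modal interactions before the decision stage.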
Libraries
Use these libraries to find Multimodal Sentiment Analysis models and implementations
Most implemented papers
Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities
Our method is based on the key insight that translation from a source to a target modality provides a method of learning joint representations using only the source modality as input.
MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis
In this paper, we aim to learn effective modality representations to aid the process of fusion.
Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis
On the MOSI and MOSEI datasets, our method surpasses current state-of-the-art methods.
Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis
Multimodal sentiment analysis aims to extract and integrate semantic information collected from multiple modalities to recognize the expressed emotions and sentiment in multimodal data.
Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis
In this work, we propose a framework named MultiModal InfoMax (MMIM), which hierarchically maximizes the Mutual Information (MI) in unimodal input pairs (inter-modality) and between multimodal fusion result and unimodal input in order to maintain task-related information through multimodal fusion.
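MMIM's hierarchical objectives rely on tractable lower bounds on mutual information. A common choice for such bounds in representation learning is the InfoNCE contrastive estimator; the sketch below shows that general idea on paired embeddings and is not the paper's exact estimator (the temperature value and normalization are assumptions).

```python
import numpy as np

def infonce_lower_bound(x, y, temperature=0.1):
    """Contrastive (InfoNCE-style) score for paired embeddings.

    x, y: arrays of shape (n, d); row i of x and row i of y are a positive
    pair (e.g. two modality representations of the same utterance), and all
    other rows serve as in-batch negatives.
    """
    x = x / np.linalg.norm(x, axis=1, keepdims=True)
    y = y / np.linalg.norm(y, axis=1, keepdims=True)
    logits = (x @ y.T) / temperature                     # (n, n) similarities
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # Mean log-probability of the positive pair; higher means the two
    # views share more information under this estimator.
    return float(np.mean(np.diag(log_probs)))
```

Maximizing such a bound between unimodal inputs, and between the fused representation and each unimodal input, is the mechanism the MMIM abstract describes for preserving task-related information through fusion.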
TVLT: Textless Vision-Language Transformer
In this work, we present the Textless Vision-Language Transformer (TVLT), where homogeneous transformer blocks take raw visual and audio inputs for vision-and-language representation learning with minimal modality-specific design, and do not use text-specific modules such as tokenization or automatic speech recognition (ASR).
UniSA: Unified Generative Framework for Sentiment Analysis
Sentiment analysis is a crucial task that aims to understand people's emotional states and predict emotional categories based on multimodal information.
Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis
In this paper, we propose a Select-Additive Learning (SAL) procedure that improves the generalizability of trained neural networks for multimodal sentiment analysis.
Tensor Fusion Network for Multimodal Sentiment Analysis
Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language.
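The core operation in the Tensor Fusion Network is a triple outer product over modality embeddings, each augmented with a constant 1 so that unimodal and bimodal interaction terms survive as sub-tensors of the fused result. A minimal sketch of that fusion step (the dimensions and variable names here are illustrative, not the paper's):

```python
import numpy as np

def tensor_fusion(h_text, h_audio, h_video):
    """Outer-product fusion in the style of the Tensor Fusion Network.

    Appending 1.0 to each modality vector means the fused tensor contains
    the original unimodal vectors and all pairwise (bimodal) interaction
    terms alongside the full trimodal interactions.
    """
    zt = np.concatenate([np.asarray(h_text, dtype=float), [1.0]])
    za = np.concatenate([np.asarray(h_audio, dtype=float), [1.0]])
    zv = np.concatenate([np.asarray(h_video, dtype=float), [1.0]])
    # Triple outer product of shape (dt+1, da+1, dv+1), flattened so a
    # downstream classifier head can consume it.
    fused = np.einsum('i,j,k->ijk', zt, za, zv)
    return fused.reshape(-1)
```

The fused vector grows multiplicatively with the modality dimensions, which is exactly the cost that later correlation-controlled and factorized fusion methods in this list aim to reduce.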
Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling
Multimodal sentiment analysis is a rapidly growing field of research.