Multimodal Sentiment Analysis

72 papers with code • 5 benchmarks • 7 datasets

Multimodal sentiment analysis is the task of performing sentiment analysis using multiple data sources, e.g. a camera feed of a speaker's face together with their recorded speech.

(Image credit: ICON: Interactive Conversational Memory Network for Multimodal Emotion Detection)
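To make the input/output setup concrete, below is a minimal late-fusion baseline sketch in PyTorch. The feature dimensions, the LSTM encoders, and the concatenation-plus-MLP fusion are illustrative assumptions, not taken from any of the papers listed further down.

```python
import torch
import torch.nn as nn

class LateFusionSentiment(nn.Module):
    """Toy multimodal sentiment model: encode each modality separately,
    concatenate the pooled features, and regress a sentiment score."""
    def __init__(self, d_text=300, d_audio=74, d_vision=35, d_hidden=128):
        super().__init__()
        # one small encoder per modality (dimensions are placeholders)
        self.text_enc = nn.LSTM(d_text, d_hidden, batch_first=True)
        self.audio_enc = nn.LSTM(d_audio, d_hidden, batch_first=True)
        self.vision_enc = nn.LSTM(d_vision, d_hidden, batch_first=True)
        # simple concatenation fusion followed by a regression head
        self.head = nn.Sequential(
            nn.Linear(3 * d_hidden, d_hidden), nn.ReLU(),
            nn.Linear(d_hidden, 1),  # e.g. sentiment intensity in [-3, 3]
        )

    def forward(self, text, audio, vision):
        # take the final hidden state of each encoder as the utterance summary
        _, (h_t, _) = self.text_enc(text)
        _, (h_a, _) = self.audio_enc(audio)
        _, (h_v, _) = self.vision_enc(vision)
        fused = torch.cat([h_t[-1], h_a[-1], h_v[-1]], dim=-1)
        return self.head(fused)

# dummy batch: 8 utterances, 20 time steps per modality
model = LateFusionSentiment()
score = model(torch.randn(8, 20, 300), torch.randn(8, 20, 74), torch.randn(8, 20, 35))
print(score.shape)  # torch.Size([8, 1])
```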

Libraries

Use these libraries to find Multimodal Sentiment Analysis models and implementations

Most implemented papers

Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities

hainow/MCTN 19 Dec 2018

Our method is based on the key insight that translation from a source to a target modality provides a method of learning joint representations using only the source modality as input.
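A rough sketch of this cyclic-translation idea is below. It is not the authors' MCTN implementation; the GRU seq2seq translators, dimensions, and loss weighting are assumptions used only to illustrate how translating text to another modality and back yields a joint representation that needs only the source modality at test time.

```python
import torch
import torch.nn as nn

class ModalityTranslator(nn.Module):
    """GRU encoder-decoder that maps a source-modality sequence to a target-modality
    sequence; the encoder state serves as the joint representation."""
    def __init__(self, d_src, d_tgt, d_hidden=64):
        super().__init__()
        self.encoder = nn.GRU(d_src, d_hidden, batch_first=True)
        self.decoder = nn.GRU(d_hidden, d_tgt, batch_first=True)

    def forward(self, src, tgt_len):
        _, h = self.encoder(src)                          # h: (1, B, H) joint representation
        dec_in = h.transpose(0, 1).repeat(1, tgt_len, 1)  # feed the summary at every step
        recon, _ = self.decoder(dec_in)
        return recon, h.squeeze(0)

# forward translation text -> vision, backward translation vision_hat -> text (cycle)
fwd = ModalityTranslator(d_src=300, d_tgt=35)
bwd = ModalityTranslator(d_src=35, d_tgt=300)
clf = nn.Linear(64, 1)  # sentiment head on the joint representation

text, vision = torch.randn(8, 20, 300), torch.randn(8, 20, 35)
vision_hat, joint = fwd(text, tgt_len=20)
text_hat, _ = bwd(vision_hat, tgt_len=20)

loss = (nn.functional.mse_loss(vision_hat, vision)               # translation loss
        + nn.functional.mse_loss(text_hat, text)                 # cyclic reconstruction loss
        + nn.functional.mse_loss(clf(joint), torch.randn(8, 1))) # task loss (dummy labels)
print(loss.item())
# At test time only `text` is needed: run fwd.encoder and classify the joint state.
```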

MISA: Modality-Invariant and -Specific Representations for Multimodal Sentiment Analysis

declare-lab/MISA 7 May 2020

In this paper, we aim to learn effective modality representations to aid the process of fusion.
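The modality-invariant/-specific decomposition can be sketched roughly as follows. This is a simplified reading of MISA: the paper uses CMD and orthogonality losses with a Transformer-based fusion, whereas here plain MSE alignment, a squared-dot-product orthogonality penalty, and concatenation stand in as assumptions.

```python
import torch
import torch.nn as nn

d_in, d = 128, 64
# shared projection (modality-invariant subspace) and one private projection per modality
shared = nn.Linear(d_in, d)
private = nn.ModuleDict({m: nn.Linear(d_in, d) for m in ("text", "audio", "vision")})
head = nn.Linear(6 * d, 1)  # fuse all six vectors into a sentiment score

feats = {m: torch.randn(8, d_in) for m in ("text", "audio", "vision")}
inv = {m: shared(x) for m, x in feats.items()}       # modality-invariant parts
spec = {m: private[m](x) for m, x in feats.items()}  # modality-specific parts

# similarity loss: invariant representations of different modalities should align
sim = sum(nn.functional.mse_loss(inv[a], inv[b])
          for a, b in [("text", "audio"), ("text", "vision"), ("audio", "vision")])
# difference loss: invariant and specific parts of the same modality should be orthogonal
diff = sum((inv[m].T @ spec[m]).pow(2).mean() for m in feats)

fused = torch.cat([inv[m] for m in feats] + [spec[m] for m in feats], dim=-1)
task = nn.functional.mse_loss(head(fused), torch.randn(8, 1))  # dummy labels

loss = task + 0.3 * sim + 0.3 * diff  # loss weights are arbitrary placeholders
print(loss.item())
```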

Learning Modality-Specific Representations with Self-Supervised Multi-Task Learning for Multimodal Sentiment Analysis

thuiar/Self-MM 9 Feb 2021

On MOSI and MOSEI datasets, our method surpasses the current state-of-the-art methods.

Bi-Bimodal Modality Fusion for Correlation-Controlled Multimodal Sentiment Analysis

declare-lab/multimodal-deep-learning 28 Jul 2021

Multimodal sentiment analysis aims to extract and integrate semantic information collected from multiple modalities to recognize the expressed emotions and sentiment in multimodal data.

Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis

declare-lab/multimodal-deep-learning EMNLP 2021

In this work, we propose a framework named MultiModal InfoMax (MMIM), which hierarchically maximizes Mutual Information (MI): first between pairs of unimodal inputs (inter-modality), and then between the multimodal fusion result and each unimodal input, so that task-related information is preserved through fusion.
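The two-level MI objective can be illustrated with a contrastive lower bound. Note that MMIM itself uses specific MI estimators; the InfoNCE-style bound, fusion network, and loss weights below are stand-in assumptions meant only to show the hierarchy of pairwise and fusion-level terms.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def infonce(x, y, temperature=0.1):
    """InfoNCE lower bound on MI between paired batches x, y of shape (B, d):
    matching pairs (i, i) should score higher than mismatched pairs (i, j).
    Minimizing this cross-entropy maximizes the MI lower bound."""
    x, y = F.normalize(x, dim=-1), F.normalize(y, dim=-1)
    logits = x @ y.T / temperature
    labels = torch.arange(x.size(0))
    return F.cross_entropy(logits, labels)

B, d = 8, 64
text, audio, vision = torch.randn(B, d), torch.randn(B, d), torch.randn(B, d)
fusion = nn.Sequential(nn.Linear(3 * d, d), nn.ReLU(), nn.Linear(d, d))
z = fusion(torch.cat([text, audio, vision], dim=-1))  # multimodal fusion result

# level 1: MI between unimodal input pairs (inter-modality)
mi_unimodal = infonce(text, audio) + infonce(text, vision) + infonce(audio, vision)
# level 2: MI between the fusion result and each unimodal input
mi_fusion = infonce(z, text) + infonce(z, audio) + infonce(z, vision)

task = F.mse_loss(nn.Linear(d, 1)(z), torch.randn(B, 1))  # dummy sentiment regression
loss = task + 0.1 * (mi_unimodal + mi_fusion)             # weights are placeholders
print(loss.item())
```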

TVLT: Textless Vision-Language Transformer

huggingface/transformers 28 Sep 2022

In this work, we present the Textless Vision-Language Transformer (TVLT), where homogeneous transformer blocks take raw visual and audio inputs for vision-and-language representation learning with minimal modality-specific design, and do not use text-specific modules such as tokenization or automatic speech recognition (ASR).
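The core idea of a single shared stack of transformer blocks over raw audio and visual patches, with no tokenizer or ASR anywhere, can be sketched as follows. This is not the TVLT API shipped in the repository above; the patch sizes, embedding width, and sentiment head are assumptions.

```python
import torch
import torch.nn as nn

# One shared ("homogeneous") transformer consumes both modalities: video frames are
# cut into image patches, audio is a spectrogram cut into patches, and both are
# linearly embedded into the same width.
d = 256
video_patch = nn.Linear(16 * 16 * 3, d)   # 16x16 RGB patches
audio_patch = nn.Linear(16 * 16, d)       # 16x16 spectrogram patches
modality_emb = nn.Embedding(2, d)         # tells the encoder which modality a token is
encoder = nn.TransformerEncoder(
    nn.TransformerEncoderLayer(d_model=d, nhead=8, batch_first=True), num_layers=4)
head = nn.Linear(d, 1)                    # e.g. sentiment score from the first token

B = 2
vis = video_patch(torch.randn(B, 196, 16 * 16 * 3))   # 196 patches per frame
aud = audio_patch(torch.randn(B, 128, 16 * 16))       # 128 spectrogram patches
vis = vis + modality_emb(torch.zeros(B, 196, dtype=torch.long))
aud = aud + modality_emb(torch.ones(B, 128, dtype=torch.long))

tokens = torch.cat([vis, aud], dim=1)     # no text tokens, no ASR anywhere
out = encoder(tokens)
print(head(out[:, 0]).shape)              # torch.Size([2, 1])
```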

UniSA: Unified Generative Framework for Sentiment Analysis

dawn0815/UniSA 4 Sep 2023

Sentiment analysis is a crucial task that aims to understand people's emotional states and predict emotional categories based on multimodal information.

Select-Additive Learning: Improving Generalization in Multimodal Sentiment Analysis

HaohanWang/SelectAdditiveLearning 16 Sep 2016

In this paper, we propose a Select-Additive Learning (SAL) procedure that improves the generalizability of trained neural networks for multimodal sentiment analysis.

Tensor Fusion Network for Multimodal Sentiment Analysis

Justin1904/TensorFusionNetworks EMNLP 2017

Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language.
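The fusion layer this paper is named after takes the outer product of the unimodal embeddings, each padded with a constant 1, so that unimodal, bimodal, and trimodal interaction terms all appear in the fused tensor. A minimal sketch of that operation (embedding dimensions and the downstream head are assumptions):

```python
import torch
import torch.nn as nn

def tensor_fusion(h_l, h_a, h_v):
    """Outer product of modality embeddings, each padded with a constant 1,
    so the result contains unimodal, bimodal and trimodal interaction terms."""
    one = torch.ones(h_l.size(0), 1)
    zl = torch.cat([h_l, one], dim=1)            # (B, dl+1)
    za = torch.cat([h_a, one], dim=1)            # (B, da+1)
    zv = torch.cat([h_v, one], dim=1)            # (B, dv+1)
    fused = torch.einsum("bi,bj,bk->bijk", zl, za, zv)
    return fused.flatten(start_dim=1)            # (B, (dl+1)*(da+1)*(dv+1))

dl, da, dv = 32, 16, 16
h_l, h_a, h_v = torch.randn(4, dl), torch.randn(4, da), torch.randn(4, dv)
fused = tensor_fusion(h_l, h_a, h_v)
head = nn.Sequential(nn.Linear(fused.size(1), 64), nn.ReLU(), nn.Linear(64, 1))
print(head(fused).shape)  # torch.Size([4, 1])
```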

Multimodal Sentiment Analysis using Hierarchical Fusion with Context Modeling

SenticNet/hfusion 16 Jun 2018

Multimodal sentiment analysis is a rapidly growing field of research.