no code implementations • NAACL (ACL) 2022 • Louis-Philippe Morency, Paul Pu Liang, Amir Zadeh
Multimodal machine learning involves integrating and modeling information from multiple heterogeneous sources of data.
1 code implementation • 30 Mar 2025 • Weisheng Jin, Maojia Song, Tej Deep Pala, Yew Ken Chia, Amir Zadeh, Chuan Li, Soujanya Poria
To address this, we propose PromptDistill, a novel, training-free method that improves inference efficiency while preserving generation quality.
no code implementations • 5 Jun 2024 • Masum Hasan, Cengiz Ozel, Nina Long, Alexander Martin, Samuel Potter, Tariq Adnan, Sangwu Lee, Amir Zadeh, Ehsan Hoque
We propose a new large synthetic hand pose estimation dataset, Hi5, and a novel, inexpensive method for collecting high-quality synthetic data that requires no human annotation or validation.
1 code implementation • 2 Mar 2023 • Yingting Li, Ambuj Mehrish, Shuai Zhao, Rishabh Bhardwaj, Amir Zadeh, Navonil Majumder, Rada Mihalcea, Soujanya Poria
To mitigate this issue, parameter-efficient transfer learning algorithms, such as adapters and prefix tuning, have been proposed as a way to introduce a few trainable parameters that can be plugged into large pre-trained models such as BERT and HuBERT.
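For intuition, here is a minimal sketch of the adapter idea, assuming a PyTorch-style bottleneck module with a residual connection inserted into an otherwise frozen pre-trained model; the names and dimensions are illustrative, not the paper's exact implementation.

```python
import torch
import torch.nn as nn

class Adapter(nn.Module):
    """Bottleneck adapter: project down, apply a nonlinearity, project back
    up, and add a residual connection. Only these few parameters are
    trained; the surrounding pre-trained model stays frozen."""
    def __init__(self, hidden_dim: int, bottleneck_dim: int = 64):
        super().__init__()
        self.down = nn.Linear(hidden_dim, bottleneck_dim)
        self.up = nn.Linear(bottleneck_dim, hidden_dim)
        self.act = nn.GELU()

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        return hidden_states + self.up(self.act(self.down(hidden_states)))
```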
no code implementations • 7 Sep 2022 • Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency
Recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in application domains such as healthcare and robotics has pushed multimodal machine learning to the fore. Given the heterogeneity of data sources and the interconnections often found between modalities, the field brings unique computational and theoretical challenges to the machine learning community.
no code implementations • 29 Jul 2022 • Alex Wilf, Martin Q. Ma, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency
Creating artificial social intelligence (algorithms that can understand the nuances of multi-person interactions) is an exciting and emerging challenge in processing facial expressions and gestures from multimodal videos.
no code implementations • 26 Oct 2021 • Amir Zadeh, Santiago Benoit, Louis-Philippe Morency
We find RVI to be a unique tool, often superior in both performance and convergence speed to previously proposed encoderless VI models as well as amortized VI models (e.g., the VAE).
1 code implementation • 3 Aug 2021 • Dushyant Singh Chauhan, Gopendra Vikram Singh, Navonil Majumder, Amir Zadeh, Asif Ekbal, Pushpak Bhattacharyya, Louis-Philippe Morency, Soujanya Poria
We propose several strong multimodal baselines and show the importance of contextual and multimodal information for humor recognition in conversations.
2 code implementations • 28 Jul 2021 • Wei Han, Hui Chen, Alexander Gelbukh, Amir Zadeh, Louis-Philippe Morency, Soujanya Poria
Multimodal sentiment analysis aims to extract and integrate semantic information collected from multiple modalities to recognize the expressed emotions and sentiment in multimodal data.
no code implementations • 3 Jan 2021 • Amir Zadeh, Santiago Benoit, Louis-Philippe Morency
In this paper, we present an approach for training deep generative models solely by solving determined systems of linear equations.
1 code implementation • NAACL 2021 • Jianing Yang, Yongxin Wang, Ruitao Yi, Yuying Zhu, Azaan Rehman, Amir Zadeh, Soujanya Poria, Louis-Philippe Morency
Human communication is multimodal in nature; it is through multiple modalities, such as language, voice, and facial expressions, that opinions and emotions are expressed.
no code implementations • 19 Oct 2020 • Shagun Uppal, Sarthak Bhagat, Devamanyu Hazarika, Navonil Majumder, Soujanya Poria, Roger Zimmermann, Amir Zadeh
Deep learning and its applications have driven impactful research and development across the diverse range of modalities present in real-world data.
no code implementations • 7 Jul 2020 • Jianing Yang, Yuying Zhu, Yongxin Wang, Ruitao Yi, Amir Zadeh, Louis-Philippe Morency
In this paper, we analyze QA biases in popular video question answering datasets and discover that pretrained language models can answer 37-48% of questions correctly without using any multimodal context information, far exceeding the 20% random-guess baseline for 5-choose-1 multiple-choice questions.
no code implementations • 3 May 2020 • Navonil Majumder, Rishabh Bhardwaj, Soujanya Poria, Amir Zadeh, Alexander Gelbukh, Amir Hussain, Louis-Philippe Morency
Aspect-based sentiment analysis (ABSA), a popular research area in NLP, has two distinct parts: aspect extraction (AE) and labeling the aspects with sentiment polarity (ALSA).
no code implementations • 19 Dec 2019 • Amir Zadeh, Simon Hessner, Yao-Chong Lim, Louis-Philippe Morency
Posterior inference in directed graphical models is commonly done using a probabilistic encoder (a.k.a. an inference model) conditioned on the input.
no code implementations • 22 Nov 2019 • Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency
As machine learning leaps toward better generalization to the real world, multimodal sequential learning becomes a fundamental research area.
no code implementations • 21 Nov 2019 • Amir Zadeh, Tianjun Ma, Soujanya Poria, Louis-Philippe Morency
To this end, we introduce a novel transformer-based model called the Spectro-Temporal Transformer (STT).
1 code implementation • ACL 2020 • Wasifur Rahman, Md. Kamrul Hasan, Sangwu Lee, Amir Zadeh, Chengfeng Mao, Louis-Philippe Morency, Ehsan Hoque
It does so by generating a shift to the internal representations of BERT and XLNet, a shift that is conditioned on the visual and acoustic modalities.
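As a rough illustration of this shifting idea (in the spirit of the paper's Multimodal Adaptation Gate, though not its exact formulation), the sketch below computes a gated displacement from visual and acoustic features and adds it to the language hidden state; all layer shapes and names are assumptions.

```python
import torch
import torch.nn as nn

class MultimodalShift(nn.Module):
    """Compute a displacement vector from visual and acoustic features and
    add it, scaled by learned gates, to the language hidden states."""
    def __init__(self, d_text: int, d_visual: int, d_acoustic: int):
        super().__init__()
        self.shift_v = nn.Linear(d_text + d_visual, d_text)
        self.shift_a = nn.Linear(d_text + d_acoustic, d_text)
        self.gate_v = nn.Linear(d_text + d_visual, 1)
        self.gate_a = nn.Linear(d_text + d_acoustic, 1)

    def forward(self, h, v, a):  # h: (B, T, d_text), v: (B, T, d_visual), a: (B, T, d_acoustic)
        hv, ha = torch.cat([h, v], dim=-1), torch.cat([h, a], dim=-1)
        shift = (torch.sigmoid(self.gate_v(hv)) * self.shift_v(hv)
                 + torch.sigmoid(self.gate_a(ha)) * self.shift_a(ha))
        return h + shift  # shifted internal representation
```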
no code implementations • CVPR 2019 • Amir Zadeh, Michael Chan, Paul Pu Liang, Edmund Tong, Louis-Philippe Morency
Human language offers a unique, unconstrained approach to probe through questions and reason through answers about social situations.
no code implementations • IJCNLP 2019 • Md. Kamrul Hasan, Wasifur Rahman, Amir Zadeh, Jianyuan Zhong, Md. Iftekhar Tanveer, Louis-Philippe Morency, Mohammed Ehsan Hoque
The dataset and accompanying studies present a framework for multimodal humor detection for the natural language processing community.
no code implementations • 3 Mar 2019 • Amir Zadeh, Yao-Chong Lim, Paul Pu Liang, Louis-Philippe Morency
We study a specific implementation of the Auto-Encoding Variational Bayes (AEVB) algorithm, referred to in this paper as the Variational Auto-Decoder (VAD).
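A minimal sketch of the auto-decoding idea, assuming Gaussian latents: instead of an encoder network, each datapoint keeps its own variational parameters, optimized by gradient descent jointly with the decoder. Dimensions, optimizer, and the Gaussian likelihood are all illustrative assumptions.

```python
import torch
import torch.nn as nn

# Hypothetical setup: a decoder maps latents to data space; per-datapoint
# variational parameters (mu, log_var) replace the usual encoder network.
n_data, latent_dim, data_dim = 1000, 8, 32
decoder = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, data_dim))
mu = nn.Parameter(torch.zeros(n_data, latent_dim))
log_var = nn.Parameter(torch.zeros(n_data, latent_dim))
opt = torch.optim.Adam(list(decoder.parameters()) + [mu, log_var], lr=1e-3)

def elbo_step(x: torch.Tensor, idx: torch.Tensor) -> torch.Tensor:
    """One ELBO ascent step on minibatch x with dataset indices idx."""
    # Reparameterized sample from each datapoint's own posterior.
    z = mu[idx] + torch.exp(0.5 * log_var[idx]) * torch.randn_like(mu[idx])
    recon = ((decoder(z) - x) ** 2).sum(dim=-1)  # Gaussian NLL up to constants
    kl = 0.5 * (mu[idx] ** 2 + log_var[idx].exp() - 1 - log_var[idx]).sum(dim=-1)
    loss = (recon + kl).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
    return loss
```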
4 code implementations • 23 Nov 2018 • Yansen Wang, Ying Shen, Zhun Liu, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency
Humans convey their intentions through both verbal and nonverbal behaviors during face-to-face communication.
1 code implementation • EMNLP 2018 • Paul Pu Liang, Ziyin Liu, Amir Zadeh, Louis-Philippe Morency
In this paper, we propose the Recurrent Multistage Fusion Network (RMFN) which decomposes the fusion problem into multiple stages, each of them focused on a subset of multimodal signals for specialized, effective fusion.
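As a sketch of the multistage intuition (not the paper's exact RMFN architecture), the module below repeatedly highlights a soft subset of a concatenated multimodal vector and integrates it into a running fusion state; all names are illustrative.

```python
import torch
import torch.nn as nn

class MultistageFusion(nn.Module):
    """At each stage, soft attention highlights a subset of the multimodal
    signals, which is then integrated into a running fusion state."""
    def __init__(self, d_multimodal: int, d_fused: int, n_stages: int = 3):
        super().__init__()
        self.n_stages = n_stages
        self.highlight = nn.Linear(d_multimodal + d_fused, d_multimodal)
        self.integrate = nn.GRUCell(d_multimodal, d_fused)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, d_multimodal)
        state = x.new_zeros(x.size(0), self.integrate.hidden_size)
        for _ in range(self.n_stages):
            weights = torch.softmax(self.highlight(torch.cat([x, state], dim=-1)), dim=-1)
            state = self.integrate(weights * x, state)
        return state
```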
2 code implementations • ICLR 2019 • Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov
Multimodal discriminative factors are shared across all modalities and contain joint multimodal features required for discriminative tasks such as sentiment prediction.
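A minimal sketch of such a factorization, assuming each modality's features are split into a shared discriminative factor (fed to a prediction head) and a modality-specific generative factor (used for reconstruction); the module and feature sizes are illustrative.

```python
import torch
import torch.nn as nn

class FactorizedEncoder(nn.Module):
    """Split one modality's features into a discriminative factor shared
    across modalities and a private factor for reconstructing the modality."""
    def __init__(self, d_in: int, d_shared: int, d_private: int):
        super().__init__()
        self.to_shared = nn.Linear(d_in, d_shared)
        self.to_private = nn.Linear(d_in, d_private)

    def forward(self, x: torch.Tensor):
        return self.to_shared(x), self.to_private(x)

# Usage sketch: fuse the shared factors from language, acoustic, and visual
# encoders for a discriminative task such as sentiment prediction.
enc_l = FactorizedEncoder(300, 64, 32)  # illustrative feature sizes
enc_a = FactorizedEncoder(74, 64, 32)
enc_v = FactorizedEncoder(35, 64, 32)
```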
1 code implementation • NAACL 2018 • Devamanyu Hazarika, Soujanya Poria, Amir Zadeh, Erik Cambria, Louis-Philippe Morency, Roger Zimmermann
Emotion recognition in conversations is crucial for the development of empathetic machines.
Ranked #57 on Emotion Recognition in Conversation on IEMOCAP
3 code implementations • ACL 2018 • Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency
Previous research in this field has exploited the expressiveness of tensors for multimodal representation.
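One well-known instance of that expressiveness is outer-product fusion: appending a constant 1 to each modality embedding and taking the outer product yields every unimodal, bimodal, and trimodal interaction term. A minimal sketch (illustrative, not this paper's low-rank formulation):

```python
import torch

def tensor_fusion(z_l: torch.Tensor, z_a: torch.Tensor, z_v: torch.Tensor) -> torch.Tensor:
    """Outer-product fusion of three modality embeddings, batch-wise.
    The appended constant 1 keeps unimodal and bimodal interaction
    terms inside the trimodal tensor."""
    ones = z_l.new_ones(z_l.size(0), 1)
    zl = torch.cat([z_l, ones], dim=-1)  # (B, dl + 1)
    za = torch.cat([z_a, ones], dim=-1)  # (B, da + 1)
    zv = torch.cat([z_v, ones], dim=-1)  # (B, dv + 1)
    # (B, dl+1, da+1, dv+1): all multiplicative interactions across modalities
    fused = torch.einsum('bi,bj,bk->bijk', zl, za, zv)
    return fused.flatten(start_dim=1)
```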
2 code implementations • 3 Feb 2018 • Amir Zadeh, Paul Pu Liang, Navonil Majumder, Soujanya Poria, Erik Cambria, Louis-Philippe Morency
In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly accounts for both view-specific and cross-view interactions and continuously models them through time.
2 code implementations • 3 Feb 2018 • Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency
AI must understand each modality and the interactions between them that shape human communication.
Ranked #10 on Multimodal Sentiment Analysis on MOSI
2 code implementations • 3 Feb 2018 • Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir Zadeh, Louis-Philippe Morency
In this paper, we propose the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model, which is composed of two modules: a Gated Multimodal Embedding and an LSTM with Temporal Attention.
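A rough sketch of the gating module's intuition, assuming a per-timestep sigmoid gate that can attenuate a noisy modality's features before they enter the LSTM; this is illustrative, not the paper's exact design.

```python
import torch
import torch.nn as nn

class GatedMultimodalEmbedding(nn.Module):
    """Per-timestep gate that can shut off a noisy modality's features
    before they are fused into the recurrent model."""
    def __init__(self, d_in: int):
        super().__init__()
        self.gate = nn.Linear(d_in, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:  # x: (B, T, d_in)
        return torch.sigmoid(self.gate(x)) * x
```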
2 code implementations • EMNLP 2017 • Amir Zadeh, Minghai Chen, Soujanya Poria, Erik Cambria, Louis-Philippe Morency
Multimodal sentiment analysis is an increasingly popular research area, which extends the conventional language-based definition of sentiment analysis to a multimodal setup where other relevant modalities accompany language.
2 code implementations • ACL 2017 • Soujanya Poria, Erik Cambria, Devamanyu Hazarika, Navonil Majumder, Amir Zadeh, Louis-Philippe Morency
Multimodal sentiment analysis is a developing area of research, which involves the identification of sentiments in videos.
Ranked #3 on Emotion Recognition in Conversation on CPED
no code implementations • ACL 2017 • Edmund Tong, Amir Zadeh, Cara Jones, Louis-Philippe Morency
Human trafficking is a global epidemic affecting millions of people across the planet.
1 code implementation • 26 Nov 2016 • Amir Zadeh, Tadas Baltrušaitis, Louis-Philippe Morency
In our work, we present a novel local detector -- Convolutional Experts Network (CEN) -- that brings together the advantages of neural architectures and mixtures of experts in an end-to-end framework.
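A minimal sketch of the mixture-of-experts flavor described here, assuming 1x1 convolutional experts whose sigmoid response maps are mixed by non-negative weights into a single alignment map; the shapes and activations are assumptions, not the paper's exact CEN.

```python
import torch
import torch.nn as nn

class ConvolutionalExperts(nn.Module):
    """Several 1x1 convolutional experts emit alignment probability maps
    that a non-negative combination layer mixes into one response map."""
    def __init__(self, in_channels: int, n_experts: int = 8):
        super().__init__()
        self.experts = nn.Conv2d(in_channels, n_experts, kernel_size=1)
        self.mix = nn.Conv2d(n_experts, 1, kernel_size=1, bias=False)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:  # feats: (B, C, H, W)
        expert_maps = torch.sigmoid(self.experts(feats))
        weights = torch.relu(self.mix.weight)  # keep mixing weights non-negative
        response = nn.functional.conv2d(expert_maps, weights)
        return torch.sigmoid(response)  # (B, 1, H, W) alignment probability map
```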
5 code implementations • 20 Jun 2016 • Amir Zadeh, Rowan Zellers, Eli Pincus, Louis-Philippe Morency
This paper introduces to the scientific community the first opinion-level annotated corpus for sentiment and subjectivity analysis in online videos: the Multimodal Opinion-level Sentiment Intensity (MOSI) dataset.