Search Results for author: Paul Pu Liang

Found 70 papers, 41 papers with code

Tutorial on Multimodal Machine Learning

no code implementations • NAACL (ACL) 2022 • Louis-Philippe Morency, Paul Pu Liang, Amir Zadeh

Multimodal machine learning involves integrating and modeling information from multiple heterogeneous sources of data.

Diverse and Admissible Trajectory Prediction through Multimodal Context Understanding

1 code implementation • ECCV 2020 • Seong Hyeon Park, Gyubok Lee, Jimin Seo, Manoj Bhat, Minseok Kang, Jonathan Francis, Ashwin Jadhav, Paul Pu Liang, Louis-Philippe Morency

Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians, for safe and reliable decision-making.

Autonomous Driving Decision Making +1

CMU-MOSEAS: A Multimodal Language Dataset for Spanish, Portuguese, German and French

no code implementations • EMNLP 2020 • AmirAli Bagher Zadeh, Yansheng Cao, Simon Hessner, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency

It covers a diverse set of topics and speakers, and carries supervision of 20 labels including sentiment (and subjectivity), emotions, and attributes.

Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions

no code implementations • 17 Apr 2024 • Leena Mathur, Paul Pu Liang, Louis-Philippe Morency

Building socially-intelligent AI agents (Social-AI) is a multidisciplinary, multimodal research goal that involves creating agents that can sense, perceive, reason about, learn from, and respond to affect, behavior, and cognition of other agents (human or artificial).

Position

Localized Symbolic Knowledge Distillation for Visual Commonsense Models

2 code implementations • NeurIPS 2023 • Jae Sung Park, Jack Hessel, Khyathi Raghavi Chandu, Paul Pu Liang, Ximing Lu, Peter West, Youngjae Yu, Qiuyuan Huang, Jianfeng Gao, Ali Farhadi, Yejin Choi

Empirical results and human evaluations in a zero-shot setup demonstrate that our distillation method results in more precise VL models of reasoning compared to a baseline of passing a generated referring expression to an LLM.

Instruction Following Knowledge Distillation +3

MMOE: Mixture of Multimodal Interaction Experts

no code implementations • 16 Nov 2023 • Haofei Yu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Multimodal machine learning, which studies the information and interactions across various input modalities, has made significant advancements in understanding the relationship between images and descriptive text.

Binary Classification Descriptive

Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities

1 code implementation • 16 Nov 2023 • Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency

Human interactions are deeply rooted in the interplay of thoughts, beliefs, and desires made possible by Theory of Mind (ToM): our cognitive ability to understand the mental states of ourselves and others.

MultiIoT: Towards Large-scale Multisensory Learning for the Internet of Things

no code implementations • 10 Nov 2023 • Shentong Mo, Paul Pu Liang, Russ Salakhutdinov, Louis-Philippe Morency

The Internet of Things (IoT), the network integrating billions of smart physical devices embedded with sensors, software, and communication technologies for the purpose of connecting and exchanging data with other devices and systems, is a critical and rapidly expanding component of our modern world.

Representation Learning

Comparative Knowledge Distillation

1 code implementation • 3 Nov 2023 • Alex Wilf, Alex Tianyi Xu, Paul Pu Liang, Alexander Obolenskiy, Daniel Fried, Louis-Philippe Morency

We observe that prevalent KD techniques and state-of-the-art data augmentation strategies fall short in this constrained setting.

Data Augmentation Knowledge Distillation

Towards Vision-Language Mechanistic Interpretability: A Causal Tracing Tool for BLIP

1 code implementation • 27 Aug 2023 • Vedant Palit, Rohan Pandey, Aryaman Arora, Paul Pu Liang

Furthermore, we release our BLIP causal tracing tool as open source to enable further experimentation in vision-language mechanistic interpretability by the community.

Question Answering Text Generation +1

MultiZoo & MultiBench: A Standardized Toolkit for Multimodal Deep Learning

1 code implementation • 28 Jun 2023 • Paul Pu Liang, Yiwei Lyu, Xiang Fan, Arav Agarwal, Yun Cheng, Louis-Philippe Morency, Ruslan Salakhutdinov

Learning multimodal representations involves integrating information from multiple heterogeneous sources of data.

Multimodal Deep Learning

Factorized Contrastive Learning: Going Beyond Multi-view Redundancy

1 code implementation • NeurIPS 2023 • Paul Pu Liang, Zihao Deng, Martin Ma, James Zou, Louis-Philippe Morency, Ruslan Salakhutdinov

How can we learn self-supervised multimodal representations to capture both shared and unique information relevant to downstream tasks?

Contrastive Learning Representation Learning

Multimodal Fusion Interactions: A Study of Human and Automatic Quantification

1 code implementation • 7 Jun 2023 • Paul Pu Liang, Yun Cheng, Ruslan Salakhutdinov, Louis-Philippe Morency

In order to perform multimodal fusion of heterogeneous signals, we need to understand their interactions: how each modality individually provides information useful for a task and how this information changes in the presence of other modalities.

counterfactual

Multimodal Learning Without Labeled Multimodal Data: Guarantees and Applications

1 code implementation • 7 Jun 2023 • Paul Pu Liang, Chun Kai Ling, Yun Cheng, Alex Obolenskiy, Yudong Liu, Rohan Pandey, Alex Wilf, Louis-Philippe Morency, Ruslan Salakhutdinov

We propose two lower bounds based on the amount of shared information between modalities and the disagreement between separately trained unimodal classifiers, and derive an upper bound through connections to approximate algorithms for min-entropy couplings.

Self-Supervised Learning

Language Models Get a Gender Makeover: Mitigating Gender Bias with Few-Shot Data Interventions

no code implementations • 7 Jun 2023 • Himanshu Thakur, Atishay Jain, Praneetha Vaddamanu, Paul Pu Liang, Louis-Philippe Morency

Since large-scale retraining of these models from scratch is both time- and compute-expensive, a variety of approaches have been previously proposed that de-bias a pre-trained model.

Language Modelling

Cross-Attention is Not Enough: Incongruity-Aware Dynamic Hierarchical Fusion for Multimodal Affect Recognition

no code implementations • 23 May 2023 • Yaoting Wang, Yuanchao Li, Paul Pu Liang, Louis-Philippe Morency, Peter Bell, Catherine Lai

Fusing multiple modalities has proven effective for multimodal information processing.

Emotion Recognition Multimodal Sentiment Analysis

Difference-Masking: Choosing What to Mask in Continued Pretraining

1 code implementation • 23 May 2023 • Alex Wilf, Syeda Nahida Akter, Leena Mathur, Paul Pu Liang, Sheryl Mathew, Mengrou Shou, Eric Nyberg, Louis-Philippe Morency

The self-supervised objective of masking-and-predicting has led to promising performance gains on a variety of downstream tasks.

Self-Supervised Learning

HIINT: Historical, Intra- and Inter- personal Dynamics Modeling with Cross-person Memory Transformer

no code implementations • 21 May 2023 • Yubin Kim, Dong Won Lee, Paul Pu Liang, Sharifa Algohwinem, Cynthia Breazeal, Hae Won Park

Accurately modeling affect dynamics, which refers to the changes and fluctuations in emotions and affective displays during human conversations, is crucial for understanding human interactions.

Language Modelling Large Language Model

Quantifying & Modeling Multimodal Interactions: An Information Decomposition Framework

1 code implementation • NeurIPS 2023 • Paul Pu Liang, Yun Cheng, Xiang Fan, Chun Kai Ling, Suzanne Nie, Richard Chen, Zihao Deng, Nicholas Allen, Randy Auerbach, Faisal Mahmood, Ruslan Salakhutdinov, Louis-Philippe Morency

The recent explosion of interest in multimodal applications has resulted in a wide selection of datasets and methods for representing and integrating information from different modalities.

Model Selection

Lecture Presentations Multimodal Dataset: Towards Understanding Multimodality in Educational Videos

no code implementations • ICCV 2023 • Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency

We introduce three research tasks, (1) figure-to-text retrieval, (2) text-to-figure retrieval, and (3) generation of slide explanations, which are grounded in multimedia learning and psychology principles to test a vision-language model's understanding of multimodal content.

Attribute Retrieval +1

Cross-modal Attention Congruence Regularization for Vision-Language Relation Alignment

1 code implementation • 20 Dec 2022 • Rohan Pandey, Rulin Shao, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

To tackle this problem, we show that relation alignment can be enforced by encouraging the directed language attention from 'mug' to 'grass' (capturing the semantic relation 'in') to match the directed visual attention from the mug to the grass.

Relation Visual Reasoning

Does Structural Attention Improve Compositional Representations in Vision-Language Models?

no code implementations • NeurIPS Workshop: Self-Supervised Learning - Theory and Practice 2022 • Rohan Pandey, Rulin Shao, Paul Pu Liang, Louis-Philippe Morency

Although scaling self-supervised approaches has gained widespread success in Vision-Language pre-training, a number of works providing structural knowledge of visually-grounded semantics have recently shown incremental performance gains.

Ranked #27 on Visual Reasoning on Winoground

Visual Reasoning

Nano: Nested Human-in-the-Loop Reward Learning for Few-shot Language Model Control

1 code implementation • 10 Nov 2022 • Xiang Fan, Yiwei Lyu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Existing techniques for controlling the distribution of generated text only work with quantified distributions, which require pre-defined categories, proportions of the distribution, or an existing corpus following the desired distributions.

Attribute Fairness +2

Uncertainty Quantification with Pre-trained Language Models: A Large-Scale Empirical Analysis

1 code implementation • 10 Oct 2022 • Yuxin Xiao, Paul Pu Liang, Umang Bhatt, Willie Neiswanger, Ruslan Salakhutdinov, Louis-Philippe Morency

In particular, there are various considerations behind the pipeline: (1) the choice and (2) the size of PLM, (3) the choice of uncertainty quantifier, (4) the choice of fine-tuning loss, and many more.

Uncertainty Quantification

Foundations and Trends in Multimodal Machine Learning: Principles, Challenges, and Open Questions

no code implementations • 7 Sep 2022 • Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

With the recent interest in video understanding, embodied autonomous agents, text-to-image generation, and multisensor fusion in application domains such as healthcare and robotics, multimodal machine learning has brought unique computational and theoretical challenges to the machine learning community given the heterogeneity of data sources and the interconnections often found between modalities.

Text-to-Image Generation Video Understanding

Multimodal Lecture Presentations Dataset: Understanding Multimodality in Educational Slides

2 code implementations • 17 Aug 2022 • Dong Won Lee, Chaitanya Ahuja, Paul Pu Liang, Sanika Natu, Louis-Philippe Morency

As a step toward developing AI to aid in student learning as intelligent teacher assistants, we introduce the Multimodal Lecture Presentations dataset as a large-scale benchmark testing the capabilities of machine learning models in multimodal understanding of educational content.

Attribute

Face-to-Face Contrastive Learning for Social Intelligence Question-Answering

no code implementations • 29 Jul 2022 • Alex Wilf, Martin Q. Ma, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

Creating artificial social intelligence - algorithms that can understand the nuances of multi-person interactions - is an exciting and emerging challenge in processing facial expressions and gestures from multimodal videos.

Contrastive Learning Question Answering

MultiViz: Towards Visualizing and Understanding Multimodal Models

1 code implementation • 30 Jun 2022 • Paul Pu Liang, Yiwei Lyu, Gunjan Chhablani, Nihal Jain, Zihao Deng, Xingbo Wang, Louis-Philippe Morency, Ruslan Salakhutdinov

How can we visualize the internal modeling of multimodal interactions in these models?

GEMv2: Multilingual NLG Benchmarking in a Single Line of Code

no code implementations • 22 Jun 2022 • Sebastian Gehrmann, Abhik Bhattacharjee, Abinaya Mahendiran, Alex Wang, Alexandros Papangelis, Aman Madaan, Angelina McMillan-Major, Anna Shvets, Ashish Upadhyay, Bingsheng Yao, Bryan Wilie, Chandra Bhagavatula, Chaobin You, Craig Thomson, Cristina Garbacea, Dakuo Wang, Daniel Deutsch, Deyi Xiong, Di Jin, Dimitra Gkatzia, Dragomir Radev, Elizabeth Clark, Esin Durmus, Faisal Ladhak, Filip Ginter, Genta Indra Winata, Hendrik Strobelt, Hiroaki Hayashi, Jekaterina Novikova, Jenna Kanerva, Jenny Chim, Jiawei Zhou, Jordan Clive, Joshua Maynez, João Sedoc, Juraj Juraska, Kaustubh Dhole, Khyathi Raghavi Chandu, Laura Perez-Beltrachini, Leonardo F. R. Ribeiro, Lewis Tunstall, Li Zhang, Mahima Pushkarna, Mathias Creutz, Michael White, Mihir Sanjay Kale, Moussa Kamal Eddine, Nico Daheim, Nishant Subramani, Ondrej Dusek, Paul Pu Liang, Pawan Sasanka Ammanamanchi, Qi Zhu, Ratish Puduppully, Reno Kriz, Rifat Shahriyar, Ronald Cardenas, Saad Mahamood, Salomey Osei, Samuel Cahyawijaya, Sanja Štajner, Sebastien Montella, Shailza, Shailza Jolly, Simon Mille, Tahmid Hasan, Tianhao Shen, Tosin Adewumi, Vikas Raunak, Vipul Raheja, Vitaly Nikolaev, Vivian Tsai, Yacine Jernite, Ying Xu, Yisi Sang, Yixin Liu, Yufang Hou

This problem is especially pertinent in natural language generation which requires ever-improving suites of datasets, metrics, and human evaluation to make definitive claims.

Benchmarking Text Generation

Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models

3 code implementations • 9 Jun 2022 • Aarohi Srivastava, Abhinav Rastogi, Abhishek Rao, Abu Awal Md Shoeb, Abubakar Abid, Adam Fisch, Adam R. Brown, Adam Santoro, Aditya Gupta, Adrià Garriga-Alonso, Agnieszka Kluska, Aitor Lewkowycz, Akshat Agarwal, Alethea Power, Alex Ray, Alex Warstadt, Alexander W. Kocurek, Ali Safaya, Ali Tazarv, Alice Xiang, Alicia Parrish, Allen Nie, Aman Hussain, Amanda Askell, Amanda Dsouza, Ambrose Slone, Ameet Rahane, Anantharaman S. Iyer, Anders Andreassen, Andrea Madotto, Andrea Santilli, Andreas Stuhlmüller, Andrew Dai, Andrew La, Andrew Lampinen, Andy Zou, Angela Jiang, Angelica Chen, Anh Vuong, Animesh Gupta, Anna Gottardi, Antonio Norelli, Anu Venkatesh, Arash Gholamidavoodi, Arfa Tabassum, Arul Menezes, Arun Kirubarajan, Asher Mullokandov, Ashish Sabharwal, Austin Herrick, Avia Efrat, Aykut Erdem, Ayla Karakaş, B. Ryan Roberts, Bao Sheng Loe, Barret Zoph, Bartłomiej Bojanowski, Batuhan Özyurt, Behnam Hedayatnia, Behnam Neyshabur, Benjamin Inden, Benno Stein, Berk Ekmekci, Bill Yuchen Lin, Blake Howald, Bryan Orinion, Cameron Diao, Cameron Dour, Catherine Stinson, Cedrick Argueta, César Ferri Ramírez, Chandan Singh, Charles Rathkopf, Chenlin Meng, Chitta Baral, Chiyu Wu, Chris Callison-Burch, Chris Waites, Christian Voigt, Christopher D. Manning, Christopher Potts, Cindy Ramirez, Clara E. Rivera, Clemencia Siro, Colin Raffel, Courtney Ashcraft, Cristina Garbacea, Damien Sileo, Dan Garrette, Dan Hendrycks, Dan Kilman, Dan Roth, Daniel Freeman, Daniel Khashabi, Daniel Levy, Daniel Moseguí González, Danielle Perszyk, Danny Hernandez, Danqi Chen, Daphne Ippolito, Dar Gilboa, David Dohan, David Drakard, David Jurgens, Debajyoti Datta, Deep Ganguli, Denis Emelin, Denis Kleyko, Deniz Yuret, Derek Chen, Derek Tam, Dieuwke Hupkes, Diganta Misra, Dilyar Buzan, Dimitri Coelho Mollo, Diyi Yang, Dong-Ho Lee, Dylan Schrader, Ekaterina Shutova, Ekin Dogus Cubuk, Elad Segal, Eleanor Hagerman, Elizabeth Barnes, Elizabeth Donoway, Ellie Pavlick, Emanuele Rodola, Emma Lam, Eric Chu, Eric Tang, Erkut Erdem, Ernie Chang, Ethan A. Chi, Ethan Dyer, Ethan Jerzak, Ethan Kim, Eunice Engefu Manyasi, Evgenii Zheltonozhskii, Fanyue Xia, Fatemeh Siar, Fernando Martínez-Plumed, Francesca Happé, Francois Chollet, Frieda Rong, Gaurav Mishra, Genta Indra Winata, Gerard de Melo, Germán Kruszewski, Giambattista Parascandolo, Giorgio Mariani, Gloria Wang, Gonzalo Jaimovitch-López, Gregor Betz, Guy Gur-Ari, Hana Galijasevic, Hannah Kim, Hannah Rashkin, Hannaneh Hajishirzi, Harsh Mehta, Hayden Bogar, Henry Shevlin, Hinrich Schütze, Hiromu Yakura, Hongming Zhang, Hugh Mee Wong, Ian Ng, Isaac Noble, Jaap Jumelet, Jack Geissinger, Jackson Kernion, Jacob Hilton, Jaehoon Lee, Jaime Fernández Fisac, James B. Simon, James Koppel, James Zheng, James Zou, Jan Kocoń, Jana Thompson, Janelle Wingfield, Jared Kaplan, Jarema Radom, Jascha Sohl-Dickstein, Jason Phang, Jason Wei, Jason Yosinski, Jekaterina Novikova, Jelle Bosscher, Jennifer Marsh, Jeremy Kim, Jeroen Taal, Jesse Engel, Jesujoba Alabi, Jiacheng Xu, Jiaming Song, Jillian Tang, Joan Waweru, John Burden, John Miller, John U. Balis, Jonathan Batchelder, Jonathan Berant, Jörg Frohberg, Jos Rozen, Jose Hernandez-Orallo, Joseph Boudeman, Joseph Guerr, Joseph Jones, Joshua B. Tenenbaum, Joshua S. Rule, Joyce Chua, Kamil Kanclerz, Karen Livescu, Karl Krauth, Karthik Gopalakrishnan, Katerina Ignatyeva, Katja Markert, Kaustubh D. Dhole, Kevin Gimpel, Kevin Omondi, Kory Mathewson, Kristen Chiafullo, Ksenia Shkaruta, Kumar Shridhar, Kyle McDonell, Kyle Richardson, Laria Reynolds, Leo Gao, Li Zhang, Liam Dugan, Lianhui Qin, Lidia Contreras-Ochando, Louis-Philippe Morency, Luca Moschella, Lucas Lam, Lucy Noble, Ludwig Schmidt, Luheng He, Luis Oliveros Colón, Luke Metz, Lütfi Kerem Şenel, Maarten Bosma, Maarten Sap, Maartje ter Hoeve, Maheen Farooqi, Manaal Faruqui, Mantas Mazeika, Marco Baturan, Marco Marelli, Marco Maru, Maria Jose Ramírez Quintana, Marie Tolkiehn, Mario Giulianelli, Martha Lewis, Martin Potthast, Matthew L. Leavitt, Matthias Hagen, Mátyás Schubert, Medina Orduna Baitemirova, Melody Arnaud, Melvin McElrath, Michael A. Yee, Michael Cohen, Michael Gu, Michael Ivanitskiy, Michael Starritt, Michael Strube, Michał Swędrowski, Michele Bevilacqua, Michihiro Yasunaga, Mihir Kale, Mike Cain, Mimee Xu, Mirac Suzgun, Mitch Walker, Mo Tiwari, Mohit Bansal, Moin Aminnaseri, Mor Geva, Mozhdeh Gheini, Mukund Varma T, Nanyun Peng, Nathan A. Chi, Nayeon Lee, Neta Gur-Ari Krakover, Nicholas Cameron, Nicholas Roberts, Nick Doiron, Nicole Martinez, Nikita Nangia, Niklas Deckers, Niklas Muennighoff, Nitish Shirish Keskar, Niveditha S. Iyer, Noah Constant, Noah Fiedel, Nuan Wen, Oliver Zhang, Omar Agha, Omar Elbaghdadi, Omer Levy, Owain Evans, Pablo Antonio Moreno Casares, Parth Doshi, Pascale Fung, Paul Pu Liang, Paul Vicol, Pegah Alipoormolabashi, Peiyuan Liao, Percy Liang, Peter Chang, Peter Eckersley, Phu Mon Htut, Pinyu Hwang, Piotr Miłkowski, Piyush Patil, Pouya Pezeshkpour, Priti Oli, Qiaozhu Mei, Qing Lyu, Qinlang Chen, Rabin Banjade, Rachel Etta Rudolph, Raefer Gabriel, Rahel Habacker, Ramon Risco, Raphaël Millière, Rhythm Garg, Richard Barnes, Rif A. Saurous, Riku Arakawa, Robbe Raymaekers, Robert Frank, Rohan Sikand, Roman Novak, Roman Sitelew, Ronan LeBras, Rosanne Liu, Rowan Jacobs, Rui Zhang, Ruslan Salakhutdinov, Ryan Chi, Ryan Lee, Ryan Stovall, Ryan Teehan, Rylan Yang, Sahib Singh, Saif M. Mohammad, Sajant Anand, Sam Dillavou, Sam Shleifer, Sam Wiseman, Samuel Gruetter, Samuel R. Bowman, Samuel S. Schoenholz, Sanghyun Han, Sanjeev Kwatra, Sarah A. Rous, Sarik Ghazarian, Sayan Ghosh, Sean Casey, Sebastian Bischoff, Sebastian Gehrmann, Sebastian Schuster, Sepideh Sadeghi, Shadi Hamdan, Sharon Zhou, Shashank Srivastava, Sherry Shi, Shikhar Singh, Shima Asaadi, Shixiang Shane Gu, Shubh Pachchigar, Shubham Toshniwal, Shyam Upadhyay, Shyamolima Debnath, Siamak Shakeri, Simon Thormeyer, Simone Melzi, Siva Reddy, Sneha Priscilla Makini, Soo-Hwan Lee, Spencer Torene, Sriharsha Hatwar, Stanislas Dehaene, Stefan Divic, Stefano Ermon, Stella Biderman, Stephanie Lin, Stephen Prasad, Steven T. Piantadosi, Stuart M. Shieber, Summer Misherghi, Svetlana Kiritchenko, Swaroop Mishra, Tal Linzen, Tal Schuster, Tao Li, Tao Yu, Tariq Ali, Tatsu Hashimoto, Te-Lin Wu, Théo Desbordes, Theodore Rothschild, Thomas Phan, Tianle Wang, Tiberius Nkinyili, Timo Schick, Timofei Kornev, Titus Tunduny, Tobias Gerstenberg, Trenton Chang, Trishala Neeraj, Tushar Khot, Tyler Shultz, Uri Shaham, Vedant Misra, Vera Demberg, Victoria Nyamai, Vikas Raunak, Vinay Ramasesh, Vinay Uday Prabhu, Vishakh Padmakumar, Vivek Srikumar, William Fedus, William Saunders, William Zhang, Wout Vossen, Xiang Ren, Xiaoyu Tong, Xinran Zhao, Xinyi Wu, Xudong Shen, Yadollah Yaghoobzadeh, Yair Lakretz, Yangqiu Song, Yasaman Bahri, Yejin Choi, Yichi Yang, Yiding Hao, Yifu Chen, Yonatan Belinkov, Yu Hou, Yufang Hou, Yuntao Bai, Zachary Seid, Zhuoye Zhao, Zijian Wang, Zijie J. Wang, ZiRui Wang, Ziyi Wu

BIG-bench focuses on tasks that are believed to be beyond the capabilities of current language models.

Common Sense Reasoning Math +1

Brainish: Formalizing A Multimodal Language for Intelligence and Consciousness

no code implementations • 14 Apr 2022 • Paul Pu Liang

Through discussing how Brainish is crucial for communication and coordination in order to achieve consciousness in the CTM, and by implementing a simple version of Brainish and evaluating its capability of demonstrating intelligence on multimodal prediction and retrieval tasks on several real-world image, text, and audio datasets, we argue that such an inner language will be important for advances in machine models of intelligence and consciousness.

Retrieval Translation

PACS: A Dataset for Physical Audiovisual CommonSense Reasoning

1 code implementation • 21 Mar 2022 • Samuel Yu, Peter Wu, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency

Our paper takes a step towards real-world physical commonsense reasoning by contributing PACS: the first audiovisual benchmark annotated for physical commonsense attributes.

Ranked #1 on Physical Commonsense Reasoning on Physical Audiovisual CommonSense

Multimodal Reasoning Physical Commonsense Reasoning

DIME: Fine-grained Interpretations of Multimodal Models via Disentangled Local Explanations

1 code implementation • 3 Mar 2022 • Yiwei Lyu, Paul Pu Liang, Zihao Deng, Ruslan Salakhutdinov, Louis-Philippe Morency

The ability for a human to understand an Artificial Intelligence (AI) model's decision-making process is critical in enabling stakeholders to visualize model behavior, perform model debugging, promote trust in AI models, and assist in collaborative human-AI decision-making.

Decision Making Disentanglement +2

High-Modality Multimodal Transformer: Quantifying Modality & Interaction Heterogeneity for High-Modality Representation Learning

1 code implementation • 2 Mar 2022 • Paul Pu Liang, Yiwei Lyu, Xiang Fan, Jeffrey Tsaw, Yudong Liu, Shentong Mo, Dani Yogatama, Louis-Philippe Morency, Ruslan Salakhutdinov

Many real-world problems are inherently multimodal, from spoken language, gestures, and paralinguistics humans use to communicate, to force, proprioception, and visual sensors on robots.

Representation Learning Time Series Analysis +2

MultiBench: Multiscale Benchmarks for Multimodal Representation Learning

2 code implementations • 15 Jul 2021 • Paul Pu Liang, Yiwei Lyu, Xiang Fan, Zetian Wu, Yun Cheng, Jason Wu, Leslie Chen, Peter Wu, Michelle A. Lee, Yuke Zhu, Ruslan Salakhutdinov, Louis-Philippe Morency

In order to accelerate progress towards understudied modalities and tasks while ensuring real-world robustness, we release MultiBench, a systematic and unified large-scale benchmark spanning 15 datasets, 10 modalities, 20 prediction tasks, and 6 research areas.

Representation Learning

Learning Language and Multimodal Privacy-Preserving Markers of Mood from Mobile Data

no code implementations • ACL 2021 • Paul Pu Liang, Terrance Liu, Anna Cai, Michal Muszynski, Ryo Ishii, Nicholas Allen, Randy Auerbach, David Brent, Ruslan Salakhutdinov, Louis-Philippe Morency

Using computational models, we find that language and multimodal representations of mobile typed text (spanning typed characters, words, keystroke timings, and app usage) are predictive of daily mood.

Privacy Preserving

Towards Understanding and Mitigating Social Biases in Language Models

1 code implementation • 24 Jun 2021 • Paul Pu Liang, Chiyu Wu, Louis-Philippe Morency, Ruslan Salakhutdinov

As machine learning methods are deployed in real-world settings such as healthcare, legal systems, and social science, it is crucial to recognize how they shape social biases and stereotypes in these sensitive decision-making processes.

Decision Making Fairness +1

Rethinking Architecture Design for Tackling Data Heterogeneity in Federated Learning

1 code implementation • CVPR 2022 • Liangqiong Qu, Yuyin Zhou, Paul Pu Liang, Yingda Xia, Feifei Wang, Ehsan Adeli, Li Fei-Fei, Daniel Rubin

Federated learning is an emerging research paradigm enabling collaborative training of machine learning models among different organizations while keeping data private at each institution.

Federated Learning

Conditional Contrastive Learning for Improving Fairness in Self-Supervised Learning

no code implementations • 5 Jun 2021 • Martin Q. Ma, Yao-Hung Hubert Tsai, Paul Pu Liang, Han Zhao, Kun Zhang, Ruslan Salakhutdinov, Louis-Philippe Morency

In this paper, we propose a Conditional Contrastive Learning (CCL) approach to improve the fairness of contrastive SSL methods.

Attribute Contrastive Learning +3

Ask & Explore: Grounded Question Answering for Curiosity-Driven Exploration

no code implementations • 24 Apr 2021 • Jivat Neet Kaur, Yiding Jiang, Paul Pu Liang

In many real-world scenarios where extrinsic rewards to the agent are extremely sparse, curiosity has emerged as a useful concept providing intrinsic rewards that enable the agent to explore its environment and acquire information to achieve its goals.

Question Answering

StylePTB: A Compositional Benchmark for Fine-grained Controllable Text Style Transfer

2 code implementations • NAACL 2021 • Yiwei Lyu, Paul Pu Liang, Hai Pham, Eduard Hovy, Barnabás Póczos, Ruslan Salakhutdinov, Louis-Philippe Morency

Many of the existing style transfer benchmarks primarily focus on individual high-level semantic changes (e.g., positive to negative), which enable controllability at a high level but do not offer fine-grained control involving sentence structure, emphasis, and content of the sentence.

Benchmarking Sentence +2

Understanding the Tradeoffs in Client-side Privacy for Downstream Speech Tasks

2 code implementations • 22 Jan 2021 • Peter Wu, Paul Pu Liang, Jiatong Shi, Ruslan Salakhutdinov, Shinji Watanabe, Louis-Philippe Morency

As users increasingly rely on cloud-based computing services, it is important to ensure that uploaded speech data remains private.

Representation Learning speech-recognition +1

Multimodal Privacy-preserving Mood Prediction from Mobile Data: A Preliminary Study

no code implementations • 4 Dec 2020 • Terrance Liu, Paul Pu Liang, Michal Muszynski, Ryo Ishii, David Brent, Randy Auerbach, Nicholas Allen, Louis-Philippe Morency

Mental health conditions remain under-diagnosed even in countries with common access to advanced medical care.

Privacy Preserving

Cross-Modal Generalization: Learning in Low Resource Modalities via Meta-Alignment

1 code implementation • 4 Dec 2020 • Paul Pu Liang, Peter Wu, Liu Ziyin, Louis-Philippe Morency, Ruslan Salakhutdinov

In this work, we propose algorithms for cross-modal generalization: a learning paradigm to train a model that can (1) quickly perform new tasks in a target modality (i.e., meta-learning) and (2) do so while being trained on a different source modality.

Meta-Learning

An Investigation of how Label Smoothing Affects Generalization

no code implementations • 23 Oct 2020 • Blair Chen, Liu Ziyin, ZiHao Wang, Paul Pu Liang

In this paper, as a step towards understanding why label smoothing is effective, we propose a theoretical framework to show how label smoothing helps control the generalization loss.

Towards Debiasing Sentence Representations

1 code implementation • ACL 2020 • Paul Pu Liang, Irene Mengze Li, Emily Zheng, Yao Chong Lim, Ruslan Salakhutdinov, Louis-Philippe Morency

As natural language processing methods are increasingly deployed in real-world scenarios such as healthcare, legal systems, and social science, it becomes necessary to recognize the role they potentially play in shaping social biases and stereotypes.

Linguistic Acceptability Natural Language Understanding +3

Anchor & Transform: Learning Sparse Embeddings for Large Vocabularies

no code implementations • ICLR 2021 • Paul Pu Liang, Manzil Zaheer, Yu-An Wang, Amr Ahmed

In this paper, we design a simple and efficient embedding algorithm that learns a small set of anchor embeddings and a sparse transformation matrix.

Language Modelling Movie Recommendation +2

Diverse and Admissible Trajectory Forecasting through Multimodal Context Understanding

1 code implementation • 6 Mar 2020 • Seong Hyeon Park, Gyubok Lee, Manoj Bhat, Jimin Seo, Minseok Kang, Jonathan Francis, Ashwin R. Jadhav, Paul Pu Liang, Louis-Philippe Morency

Multi-agent trajectory forecasting in autonomous driving requires an agent to accurately anticipate the behaviors of the surrounding vehicles and pedestrians, for safe and reliable decision-making.

Autonomous Driving Decision Making +1

On Emergent Communication in Competitive Multi-Agent Teams

1 code implementation • 4 Mar 2020 • Paul Pu Liang, Jeffrey Chen, Ruslan Salakhutdinov, Louis-Philippe Morency, Satwik Kottur

Several recent works have found the emergence of grounded compositional language in the communication protocols developed by mostly cooperative multi-agent systems when learned end-to-end to maximize performance on a downstream task.

Learning Not to Learn in the Presence of Noisy Labels

no code implementations • 16 Feb 2020 • Liu Ziyin, Blair Chen, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

Learning in the presence of label noise is a challenging yet important task: it is crucial to design models that are robust in the presence of mislabeled datasets.

Memorization text-classification +1

Think Locally, Act Globally: Federated Learning with Local and Global Representations

4 code implementations • 6 Jan 2020 • Paul Pu Liang, Terrance Liu, Liu Ziyin, Nicholas B. Allen, Randy P. Auerbach, David Brent, Ruslan Salakhutdinov, Louis-Philippe Morency

To this end, we propose a new federated learning algorithm that jointly learns compact local representations on each device and a global model across all devices.

Federated Learning Representation Learning +2

Factorized Multimodal Transformer for Multimodal Sequential Learning

no code implementations • 22 Nov 2019 • Amir Zadeh, Chengfeng Mao, Kelly Shi, Yiwei Zhang, Paul Pu Liang, Soujanya Poria, Louis-Philippe Morency

As machine learning leaps towards better generalization to the real world, multimodal sequential learning becomes a fundamental research area.

Paper
Add Code

Anchor & Transform: Learning Sparse Representations of Discrete Objects

no code implementations • 25 Sep 2019 • Paul Pu Liang, Manzil Zaheer, Yuan Wang, Amr Ahmed

Learning continuous representations of discrete objects such as text, users, and items lies at the heart of many applications including text and user modeling.

Language Modelling text-classification +1

Paper
Add Code

A Simple Approach to the Noisy Label Problem Through the Gambler's Loss

no code implementations • 25 Sep 2019 • Liu Ziyin, Ru Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

Learning in the presence of label noise is a challenging yet important task.

Memorization

Paper
Add Code

Learning Representations from Imperfect Time Series Data via Tensor Rank Regularization

no code implementations • ACL 2019 • Paul Pu Liang, Zhun Liu, Yao-Hung Hubert Tsai, Qibin Zhao, Ruslan Salakhutdinov, Louis-Philippe Morency

Our method is based on the observation that high-dimensional multimodal time series data often exhibit correlations across time and modalities, which lead to low-rank tensor representations.

Question Answering Sentiment Analysis +4

Paper
Add Code

Deep Gamblers: Learning to Abstain with Portfolio Theory

3 code implementations • NeurIPS 2019 • Liu Ziyin, Zhikang Wang, Paul Pu Liang, Ruslan Salakhutdinov, Louis-Philippe Morency, Masahito Ueda

We deal with the selective classification problem (supervised-learning problem with a rejection option), where we want to achieve the best performance at a certain level of coverage of the data.

Classification General Classification

Paper
Code

Multimodal Transformer for Unaligned Multimodal Language Sequences

4 code implementations • ACL 2019 • Yao-Hung Hubert Tsai, Shaojie Bai, Paul Pu Liang, J. Zico Kolter, Louis-Philippe Morency, Ruslan Salakhutdinov

Human language is often multimodal, comprising a mixture of natural language, facial gestures, and acoustic behaviors.

Ranked #5 on Multimodal Sentiment Analysis on MOSI

Multimodal Sentiment Analysis Time Series +1

751

Paper
Code

Strong and Simple Baselines for Multimodal Utterance Embeddings

1 code implementation • NAACL 2019 • Paul Pu Liang, Yao Chong Lim, Yao-Hung Hubert Tsai, Ruslan Salakhutdinov, Louis-Philippe Morency

Human language is a rich multimodal signal consisting of spoken words, facial expressions, body gestures, and vocal intonations.

Benchmarking

Paper
Code

Variational Auto-Decoder: A Method for Neural Generative Modeling from Incomplete Data

no code implementations • 3 Mar 2019 • Amir Zadeh, Yao-Chong Lim, Paul Pu Liang, Louis-Philippe Morency

We study a specific implementation of the Auto-Encoding Variational Bayes (AEVB) algorithm, named in this paper as a Variational Auto-Decoder (VAD).

Paper
Add Code

Found in Translation: Learning Robust Joint Representations by Cyclic Translations Between Modalities

2 code implementations • 19 Dec 2018 • Hai Pham, Paul Pu Liang, Thomas Manzini, Louis-Philippe Morency, Barnabas Poczos

Our method is based on the key insight that translation from a source to a target modality provides a method of learning joint representations using only the source modality as input.

Machine Translation Multimodal Sentiment Analysis +1

106

Paper
Code

An Empirical Evaluation of Sketched SVD and its Application to Leverage Score Ordering

no code implementations • 19 Dec 2018 • Hui Han Chin, Paul Pu Liang

We provide a comprehensive empirical evaluation of these algorithms and provide guidelines on how to ensure accurate deployment to real-world data.

Image Classification Sentiment Analysis

Paper
Add Code

Words Can Shift: Dynamically Adjusting Word Representations Using Nonverbal Behaviors

4 code implementations • 23 Nov 2018 • Yansen Wang, Ying Shen, Zhun Liu, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

Humans convey their intentions through the usage of both verbal and nonverbal behaviors during face-to-face communication.

Emotion Recognition Multimodal Sentiment Analysis

Paper
Code

Multimodal Language Analysis with Recurrent Multistage Fusion

1 code implementation • EMNLP 2018 • Paul Pu Liang, Ziyin Liu, Amir Zadeh, Louis-Philippe Morency

In this paper, we propose the Recurrent Multistage Fusion Network (RMFN) which decomposes the fusion problem into multiple stages, each of them focused on a subset of multimodal signals for specialized, effective fusion.

Emotion Recognition Multimodal Sentiment Analysis

Paper
Code

Seq2Seq2Sentiment: Multimodal Sequence to Sequence Models for Sentiment Analysis

no code implementations • WS 2018 • Hai Pham, Thomas Manzini, Paul Pu Liang, Barnabas Poczos

Multimodal machine learning is a core research area spanning the language, visual and acoustic modalities.

Multimodal Sentiment Analysis Translation

Paper
Add Code

Multimodal Language Analysis in the Wild: CMU-MOSEI Dataset and Interpretable Dynamic Fusion Graph

no code implementations • ACL 2018 • AmirAli Bagher Zadeh, Paul Pu Liang, Soujanya Poria, Erik Cambria, Louis-Philippe Morency

Analyzing human multimodal language is an emerging area of research in NLP.

Ranked #11 on Multimodal Sentiment Analysis on CMU-MOSEI (using extra training data)

Emotion Recognition Language Modelling +2

Paper
Add Code

Learning Factorized Multimodal Representations

2 code implementations • ICLR 2019 • Yao-Hung Hubert Tsai, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency, Ruslan Salakhutdinov

Multimodal discriminative factors are shared across all modalities and contain joint multimodal features required for discriminative tasks such as sentiment prediction.

Representation Learning

Paper
Code

Efficient Low-rank Multimodal Fusion with Modality-Specific Factors

3 code implementations • ACL 2018 • Zhun Liu, Ying Shen, Varun Bharadhwaj Lakshminarasimhan, Paul Pu Liang, Amir Zadeh, Louis-Philippe Morency

Previous research in this field has exploited the expressiveness of tensors for multimodal representation.

Emotion Recognition Multimodal Sentiment Analysis

234

Paper
Code

Multi-attention Recurrent Network for Human Communication Comprehension

2 code implementations • 3 Feb 2018 • Amir Zadeh, Paul Pu Liang, Soujanya Poria, Prateek Vij, Erik Cambria, Louis-Philippe Morency

AI must understand each modality and the interactions between them that shape human communication.

Ranked #9 on Multimodal Sentiment Analysis on MOSI

Emotion Recognition Multimodal Sentiment Analysis

106

Paper
Code

Multimodal Sentiment Analysis with Word-Level Fusion and Reinforcement Learning

2 code implementations • 3 Feb 2018 • Minghai Chen, Sen Wang, Paul Pu Liang, Tadas Baltrušaitis, Amir Zadeh, Louis-Philippe Morency

In this paper, we propose the Gated Multimodal Embedding LSTM with Temporal Attention (GME-LSTM(A)) model that is composed of 2 modules.

Multimodal Sentiment Analysis reinforcement-learning +3

106

Paper
Code

Memory Fusion Network for Multi-view Sequential Learning

2 code implementations • 3 Feb 2018 • Amir Zadeh, Paul Pu Liang, Navonil Mazumder, Soujanya Poria, Erik Cambria, Louis-Philippe Morency

In this paper, we present a new neural architecture for multi-view sequential learning called the Memory Fusion Network (MFN) that explicitly accounts for both view-specific and cross-view interactions and continuously models them through time.

106

Paper
Code
