Search Results for author: Rada Mihalcea

Found 179 papers, 61 papers with code

Counseling-Style Reflection Generation Using Generative Pretrained Transformers with Augmented Context

no code implementations SIGDIAL (ACL) 2020 Siqi Shen, Charles Welch, Rada Mihalcea, Verónica Pérez-Rosas

We introduce a counseling dialogue system that seeks to assist counselors while they are learning and refining their counseling skills.

In-the-Wild Video Question Answering

no code implementations COLING 2022 Santiago Castro, Naihao Deng, Pingxuan Huang, Mihai Burzo, Rada Mihalcea

Existing video understanding datasets mostly focus on human interactions, with little attention being paid to the “in the wild” settings, where the videos are recorded outdoors.

Evidence Selection Question Answering +2

Knowledge Enhanced Reflection Generation for Counseling Dialogues

no code implementations ACL 2022 Siqi Shen, Veronica Perez-Rosas, Charles Welch, Soujanya Poria, Rada Mihalcea

We propose a pipeline that collects domain knowledge through web mining, and show that retrieval from both domain-specific and commonsense knowledge bases improves the quality of generated responses.

Retrieval

Text-Aware Graph Embeddings for Donation Behavior Prediction

no code implementations COLING (TextGraphs) 2022 MeiXing Dong, Xueming Xu, Rada Mihalcea

Predicting user behavior is essential for a large number of applications including recommender and dialog systems, and more broadly in domains such as healthcare, education, and economics.

Analyzing the Effects of Annotator Gender across NLP Tasks

1 code implementation NLPerspectives (LREC) 2022 Laura Biester, Vanita Sharma, Ashkan Kazemi, Naihao Deng, Steven Wilson, Rada Mihalcea

Recent studies have shown that for subjective annotation tasks, the demographics, lived experiences, and identity of annotators can have a large impact on how items are labeled.

Natural Language Inference

Modality-specific Learning Rates for Effective Multimodal Additive Late-fusion

no code implementations Findings (ACL) 2022 Yiqun Yao, Rada Mihalcea

Moreover, for different modalities, the best unimodal models may work under significantly different learning rates due to the nature of the modality and the computational flow of the model; thus, selecting a global learning rate for late-fusion models can result in a vanishing gradient for some modalities.

Open-Ended Question Answering

Implicit Personalization in Language Models: A Systematic Study

no code implementations 23 May 2024 Zhijing Jin, Nils Heil, Jiarui Liu, Shehzaad Dhuliawala, Yahang Qi, Bernhard Schölkopf, Rada Mihalcea, Mrinmaya Sachan

This work systematically studies IP through a rigorous mathematical formulation, a multi-perspective moral reasoning framework, and a set of case studies.

Philosophy

Understanding the Capabilities and Limitations of Large Language Models for Cultural Commonsense

no code implementations 7 May 2024 Siqi Shen, Lajanugen Logeswaran, Moontae Lee, Honglak Lee, Soujanya Poria, Rada Mihalcea

Large language models (LLMs) have demonstrated substantial commonsense understanding through numerous benchmark evaluations.

Towards Dog Bark Decoding: Leveraging Human Speech Processing for Automated Bark Classification

no code implementations 29 Apr 2024 Artem Abzaliev, Humberto Pérez Espinosa, Rada Mihalcea

In this paper, we address dog vocalizations and explore the use of self-supervised speech representation models pre-trained on human speech to address dog bark classification tasks that find parallels in human-centered tasks in speech recognition.

Classification Gender Classification +2

Cooperate or Collapse: Emergence of Sustainability Behaviors in a Society of LLM Agents

no code implementations 25 Apr 2024 Giorgio Piatti, Zhijing Jin, Max Kleiman-Weiner, Bernhard Schölkopf, Mrinmaya Sachan, Rada Mihalcea

Through this simulation environment, we explore the dynamics of resource sharing among AI agents, highlighting the importance of ethical considerations, strategic planning, and negotiation skills.

Decision Making

Cross-cultural Inspiration Detection and Analysis in Real and LLM-generated Social Media Data

1 code implementation 19 Apr 2024 Oana Ignat, Gayathri Ganesh Lakshmy, Rada Mihalcea

To this end, we compile and make publicly available the InspAIred dataset, which consists of 2,000 real inspiring posts, 2,000 real non-inspiring posts, and 2,000 generated inspiring posts evenly distributed across India and the UK.

MAiDE-up: Multilingual Deception Detection of GPT-generated Hotel Reviews

no code implementations 19 Apr 2024 Oana Ignat, Xiaomeng Xu, Rada Mihalcea

Using this dataset, we conduct extensive linguistic analyses to (1) compare the AI fake hotel reviews to real hotel reviews, and (2) identify the factors that influence the deception detection model performance.

Deception Detection

Tango 2: Aligning Diffusion-based Text-to-Audio Generations through Direct Preference Optimization

1 code implementation 15 Apr 2024 Navonil Majumder, Chia-Yu Hung, Deepanway Ghosal, Wei-Ning Hsu, Rada Mihalcea, Soujanya Poria

These models do not explicitly focus on the presence of concepts or events and their temporal ordering in the output audio with respect to the input prompt.

Audio Generation

The Generation Gap: Exploring Age Bias in the Underlying Value Systems of Large Language Models

no code implementations 12 Apr 2024 Siyang Liu, Trish Maturi, Bowen Yi, Siqi Shen, Rada Mihalcea

In this paper, we explore the alignment of values in Large Language Models (LLMs) with specific age groups, leveraging data from the World Value Survey across thirteen categories.

Towards Algorithmic Fidelity: Mental Health Representation across Demographics in Synthetic vs. Human-generated Data

1 code implementation 25 Mar 2024 Shinka Mori, Oana Ignat, Andrew Lee, Rada Mihalcea

Using GPT-3, we develop HEADROOM, a synthetic dataset of 3,120 posts about depression-triggering stressors, by controlling for race, gender, and time frame (before and after COVID-19).

Synthetic Data Generation

Dynamic Reward Adjustment in Multi-Reward Reinforcement Learning for Counselor Reflection Generation

1 code implementation 20 Mar 2024 Do June Min, Veronica Perez-Rosas, Kenneth Resnicow, Rada Mihalcea

In this paper, we study the problem of multi-reward reinforcement learning to jointly optimize for multiple text qualities for natural language generation.

Text Generation

Annotations on a Budget: Leveraging Geo-Data Similarity to Balance Model Performance and Annotation Cost

1 code implementation 12 Mar 2024 Oana Ignat, Longju Bai, Joan Nwatu, Rada Mihalcea

In this paper, we propose methods to identify the data to be annotated to balance model performance and annotation costs.

SQL-CRAFT: Text-to-SQL through Interactive Refinement and Enhanced Reasoning

no code implementations 20 Feb 2024 Hanchen Xia, Feng Jiang, Naihao Deng, Cunxiang Wang, Guojiang Zhao, Rada Mihalcea, Yue Zhang

Modern LLMs have become increasingly powerful, but they are still facing challenges in specialized tasks such as Text-to-SQL.

Text-To-SQL

Tables as Images? Exploring the Strengths and Limitations of LLMs on Multimodal Representations of Tabular Data

no code implementations 19 Feb 2024 Naihao Deng, Zhenjie Sun, Ruiqi He, Aman Sikka, Yulong Chen, Lin Ma, Yue Zhang, Rada Mihalcea

In this paper, we investigate the effectiveness of various LLMs in interpreting tabular data through different prompting strategies and data formats.

Fact Checking Question Answering

Caught in the Quicksand of Reasoning, Far from AGI Summit: Evaluating LLMs' Mathematical and Coding Competency through Ontology-guided Interventions

1 code implementation 17 Jan 2024 Pengfei Hong, Deepanway Ghosal, Navonil Majumder, Somak Aditya, Rada Mihalcea, Soujanya Poria

Recent advancements in Large Language Models (LLMs) have showcased striking results on existing logical reasoning benchmarks, with some models even surpassing human performance.

Arithmetic Reasoning Code Generation +3

Whose wife is it anyway? Assessing bias against same-gender relationships in machine translation

no code implementations 10 Jan 2024 Ian Stewart, Rada Mihalcea

Machine translation often suffers from biased data and algorithms that can lead to unacceptable errors in system output.

Machine Translation

A Mechanistic Understanding of Alignment Algorithms: A Case Study on DPO and Toxicity

1 code implementation 3 Jan 2024 Andrew Lee, Xiaoyan Bai, Itamar Pres, Martin Wattenberg, Jonathan K. Kummerfeld, Rada Mihalcea

While alignment algorithms are now commonly used to tune pre-trained language models towards a user's preferences, we lack explanations for the underlying mechanisms in which models become "aligned", thus making it difficult to explain phenomena like jailbreaks.

Language Modelling

VERVE: Template-based ReflectiVE Rewriting for MotiVational IntErviewing

no code implementations 14 Nov 2023 Do June Min, Verónica Pérez-Rosas, Kenneth Resnicow, Rada Mihalcea

We introduce VERVE, a template-based rewriting system with paraphrase-augmented training and adaptive template updating.

Bridging the Digital Divide: Performance Variation across Socio-Economic Factors in Vision-Language Models

1 code implementation 9 Nov 2023 Joan Nwatu, Oana Ignat, Rada Mihalcea

Despite the impressive performance of current AI models reported across various tasks, performance reports often do not include evaluations of how these models perform on the specific groups that will be impacted by these technologies.

Language Modelling

Task-Adaptive Tokenization: Enhancing Long-Form Text Generation Efficacy in Mental Health and Beyond

no code implementations 9 Oct 2023 Siyang Liu, Naihao Deng, Sahand Sabour, Yilin Jia, Minlie Huang, Rada Mihalcea

We propose task-adaptive tokenization as a way to adapt the generation pipeline to the specifics of a downstream task and enhance long-form generation in mental health.

Question Answering Text Generation

Human Action Co-occurrence in Lifestyle Vlogs using Graph Link Prediction

1 code implementation 12 Sep 2023 Oana Ignat, Santiago Castro, Weiji Li, Rada Mihalcea

We create and make publicly available the ACE (Action Co-occurrencE) dataset, consisting of a large graph of ~12k co-occurring pairs of visual actions and their corresponding video clips.

Link Prediction

Misinformation as Information Pollution

no code implementations 21 Jun 2023 Ashkan Kazemi, Rada Mihalcea

Social media feed algorithms are designed to optimize online social engagements for the purpose of maximizing advertising profits, and therefore have an incentive to promote controversial posts including misinformation.

Misinformation

Can Large Language Models Infer Causation from Correlation?

1 code implementation 9 Jun 2023 Zhijing Jin, Jiarui Liu, Zhiheng Lyu, Spencer Poff, Mrinmaya Sachan, Rada Mihalcea, Mona Diab, Bernhard Schölkopf

In this work, we propose the first benchmark dataset to test the pure causal inference skills of large language models (LLMs).

Causal Inference

Scalable Performance Analysis for Vision-Language Models

1 code implementation 30 May 2023 Santiago Castro, Oana Ignat, Rada Mihalcea

Joint vision-language models have shown great performance over a diverse set of tasks.

Voices of Her: Analyzing Gender Differences in the AI Publication World

1 code implementation 24 May 2023 Yiwen Ding, Jiarui Liu, Zhiheng Lyu, Kun Zhang, Bernhard Schoelkopf, Zhijing Jin, Rada Mihalcea

While several previous studies have analyzed gender bias in research, we are still missing a comprehensive analysis of gender differences in the AI community, covering diverse topics and different development trends.

EASE: An Easily-Customized Annotation System Powered by Efficiency Enhancement Mechanisms

no code implementations 23 May 2023 Naihao Deng, YiKai Liu, Mingye Chen, Winston Wu, Siyang Liu, Yulong Chen, Yue Zhang, Rada Mihalcea

Our results show that our system can meet the diverse needs of NLP researchers and significantly accelerate the annotation process.

Active Learning

Beyond Good Intentions: Reporting the Research Landscape of NLP for Social Good

1 code implementation 9 May 2023 Fernando Gonzalez, Zhijing Jin, Bernhard Schölkopf, Tom Hope, Mrinmaya Sachan, Rada Mihalcea

Using state-of-the-art NLP models, we address each of these tasks and use them on the entire ACL Anthology, resulting in a visualization workspace that gives researchers a comprehensive overview of the field of NLP4SG.

Psychologically-Inspired Causal Prompts

1 code implementation 2 May 2023 Zhiheng Lyu, Zhijing Jin, Justus Mattern, Rada Mihalcea, Mrinmaya Sachan, Bernhard Schoelkopf

In this work, we take sentiment classification as an example and look into the causal relations between the review (X) and sentiment (Y).

Sentiment Analysis Sentiment Classification

A Review of Deep Learning Techniques for Speech Processing

no code implementations 30 Apr 2023 Ambuj Mehrish, Navonil Majumder, Rishabh Bhardwaj, Rada Mihalcea, Soujanya Poria

The power of deep learning techniques has opened up new avenues for research and innovation in the field of speech processing, with far-reaching implications for a range of industries and applications.

Automatic Speech Recognition Emotion Recognition +4

Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding

1 code implementation 2 Mar 2023 Yingting Li, Ambuj Mehrish, Shuai Zhao, Rishabh Bhardwaj, Amir Zadeh, Navonil Majumder, Rada Mihalcea, Soujanya Poria

To mitigate this issue, parameter-efficient transfer learning algorithms, such as adapters and prefix tuning, have been proposed as a way to introduce a few trainable parameters that can be plugged into large pre-trained language models such as BERT and HuBERT.

Speech Synthesis Transfer Learning

Natural Language Processing for Policymaking

no code implementations 7 Feb 2023 Zhijing Jin, Rada Mihalcea

This text is from Chapter 7 (pages 141-162) of the Handbook of Computational Social Science for Policy (2023).

Event Extraction text-classification +1

Understanding Stereotypes in Language Models: Towards Robust Measurement and Zero-Shot Debiasing

no code implementations 20 Dec 2022 Justus Mattern, Zhijing Jin, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

Generated texts from large pretrained language models have been shown to exhibit a variety of harmful, human-like biases about various demographics.

Benchmarking

Editing a Woman's Voice

1 code implementation 5 Dec 2022 Anna Costello, Ekaterina Fedorova, Zhijing Jin, Rada Mihalcea

However, when we trace those early drafts to their published versions, a substantial gender gap in linguistic uncertainty arises.

Two is Better than Many? Binary Classification as an Effective Approach to Multi-Choice Question Answering

1 code implementation 29 Oct 2022 Deepanway Ghosal, Navonil Majumder, Rada Mihalcea, Soujanya Poria

We show the efficacy of our proposed approach in different tasks -- abductive reasoning, commonsense question answering, science question answering, and sentence completion.

Binary Classification Science Question Answering +2

Query Rewriting for Effective Misinformation Discovery

no code implementations 14 Oct 2022 Ashkan Kazemi, Artem Abzaliev, Naihao Deng, Rui Hou, Scott A. Hale, Verónica Pérez-Rosas, Rada Mihalcea

We propose a novel system to help fact-checkers formulate search queries for known misinformation claims and effectively search across multiple social media platforms.

Misinformation reinforcement-learning +2

Multiview Contextual Commonsense Inference: A New Dataset and Task

1 code implementation 6 Oct 2022 Siqi Shen, Deepanway Ghosal, Navonil Majumder, Henry Lim, Rada Mihalcea, Soujanya Poria

Our results show that the proposed pre-training objectives are effective at adapting the pre-trained T5-Large model for the contextual commonsense inference task.

Ranked #1 on Multiview Contextual Commonsense Inference on CICERO (using extra training data)

Multiview Contextual Commonsense Inference

When to Make Exceptions: Exploring Language Models as Accounts of Human Moral Judgment

1 code implementation 4 Oct 2022 Zhijing Jin, Sydney Levine, Fernando Gonzalez, Ojasv Kamal, Maarten Sap, Mrinmaya Sachan, Rada Mihalcea, Josh Tenenbaum, Bernhard Schölkopf

Using a state-of-the-art large language model (LLM) as a basis, we propose a novel moral chain of thought (MORALCOT) prompting strategy that combines the strengths of LLMs with theories of moral reasoning developed in cognitive science to predict human moral judgments.

Language Modelling Large Language Model +1

WildQA: In-the-Wild Video Question Answering

no code implementations 14 Sep 2022 Santiago Castro, Naihao Deng, Pingxuan Huang, Mihai Burzo, Rada Mihalcea

Existing video understanding datasets mostly focus on human interactions, with little attention being paid to the "in the wild" settings, where the videos are recorded outdoors.

Evidence Selection Question Answering +2

We Are in This Together: Quantifying Community Subjective Wellbeing and Resilience

no code implementations 23 Aug 2022 MeiXing Dong, Ruixuan Sun, Laura Biester, Rada Mihalcea

Notably, we find that communities that talked more about social ties normally experienced in-person, such as friends, family, and affiliations, were actually more likely to be impacted.

Time Series Time Series Analysis

Using Paraphrases to Study Properties of Contextual Embeddings

no code implementations NAACL 2022 Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea

We use paraphrases as a unique source of data to analyze contextualized embeddings, with a particular focus on BERT.

Logical Fallacy Detection

2 code implementations 28 Feb 2022 Zhijing Jin, Abhinav Lalwani, Tejas Vaidhya, Xiaoyu Shen, Yiwen Ding, Zhiheng Lyu, Mrinmaya Sachan, Rada Mihalcea, Bernhard Schölkopf

In this paper, we propose the task of logical fallacy detection, and provide a new dataset (Logic) of logical fallacies generally found in text, together with an additional challenge set for detecting logical fallacies in climate change claims (LogicClimate).

Language Modelling Logical Fallacies +2

Matching Tweets With Applicable Fact-Checks Across Languages

no code implementations 14 Feb 2022 Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Scott A. Hale, Rada Mihalcea

We conduct both classification and retrieval experiments, in monolingual (English only), multilingual (Spanish, Portuguese), and cross-lingual (Hindi-English) settings using multilingual transformer models such as XLM-RoBERTa and multilingual embeddings such as LaBSE and SBERT.

Fact Checking Retrieval

Micromodels for Efficient, Explainable, and Reusable Systems: A Case Study on Mental Health

1 code implementation Findings (EMNLP) 2021 Andrew Lee, Jonathan K. Kummerfeld, Lawrence C. An, Rada Mihalcea

Many statistical models have high accuracy on test benchmarks, but are not explainable, struggle in low-resource scenarios, cannot be reused for multiple tasks, and cannot easily integrate domain expertise.

Classification

Exemplars-guided Empathetic Response Generation Controlled by the Elements of Human Communication

1 code implementation 22 Jun 2021 Navonil Majumder, Deepanway Ghosal, Devamanyu Hazarika, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

We empirically show that these approaches yield significant improvements in empathetic response quality in terms of both automated and human-evaluated metrics.

Empathetic Response Generation Passage Retrieval +2

How Good Is NLP? A Sober Look at NLP Tasks through the Lens of Social Impact

2 code implementations Findings (ACL) 2021 Zhijing Jin, Geeticka Chauhan, Brian Tse, Mrinmaya Sachan, Rada Mihalcea

We lay the foundations via the moral philosophy definition of social good, propose a framework to evaluate the direct and indirect real-world impact of NLP tasks, and adopt the methodology of global priorities research to identify priority causes for NLP research.

Philosophy

MUSER: MUltimodal Stress Detection using Emotion Recognition as an Auxiliary Task

no code implementations NAACL 2021 Yiqun Yao, Michalis Papakostas, Mihai Burzo, Mohamed Abouelenien, Rada Mihalcea

The capability to automatically detect human stress can benefit artificial intelligent agents involved in affective computing and human-computer interaction.

Emotion Recognition Multi-Task Learning

Extractive and Abstractive Explanations for Fact-Checking and Evaluation of News

no code implementations NAACL (NLP4IF) 2021 Ashkan Kazemi, Zehua Li, Verónica Pérez-Rosas, Rada Mihalcea

In this paper, we explore the construction of natural language explanations for news claims, with the goal of assisting fact-checking and news evaluation applications.

Fact Checking Language Modelling +1

FIBER: Fill-in-the-Blanks as a Challenging Video Understanding Evaluation Framework

1 code implementation ACL 2022 Santiago Castro, Ruoyao Wang, Pingxuan Huang, Ian Stewart, Oana Ignat, Nan Liu, Jonathan C. Stroud, Rada Mihalcea

We propose fill-in-the-blanks as a video understanding evaluation framework and introduce FIBER -- a novel dataset consisting of 28,000 videos and descriptions in support of this evaluation framework.

Language Modelling Multiple-choice +4

Chord Embeddings: Analyzing What They Capture and Their Role for Next Chord Prediction and Artist Attribute Prediction

no code implementations 4 Feb 2021 Allison Lahnala, Gauri Kambhatla, Jiajun Peng, Matthew Whitehead, Gillian Minnehan, Eric Guldan, Jonathan K. Kummerfeld, Anıl Çamcı, Rada Mihalcea

In the first case study, we demonstrate that using chord embeddings in a next chord prediction task yields predictions that more closely match those by experienced musicians.

Attribute

White Paper: Challenges and Considerations for the Creation of a Large Labelled Repository of Online Videos with Questionable Content

no code implementations 25 Jan 2021 Thamar Solorio, Mahsa Shafaei, Christos Smailis, Mona Diab, Theodore Giannakopoulos, Heng Ji, Yang Liu, Rada Mihalcea, Smaranda Muresan, Ioannis Kakadiaris

This white paper presents a summary of the discussions regarding critical considerations to develop an extensive repository of online videos annotated with labels indicating questionable content.

Recognizing Emotion Cause in Conversations

1 code implementation 22 Dec 2020 Soujanya Poria, Navonil Majumder, Devamanyu Hazarika, Deepanway Ghosal, Rishabh Bhardwaj, Samson Yu Bai Jian, Pengfei Hong, Romila Ghosh, Abhinaba Roy, Niyati Chhaya, Alexander Gelbukh, Rada Mihalcea

We address the problem of recognizing emotion cause in conversations, define two novel sub-tasks of this problem, and provide a corresponding dialogue-level dataset, along with strong Transformer-based baselines.

Causal Emotion Entailment Emotion Cause Extraction

Improving Zero Shot Learning Baselines with Commonsense Knowledge

no code implementations 11 Dec 2020 Abhinaba Roy, Deepanway Ghosal, Erik Cambria, Navonil Majumder, Rada Mihalcea, Soujanya Poria

Zero shot learning -- the problem of training and testing on a completely disjoint set of classes -- relies greatly on its ability to transfer knowledge from train classes to test classes.

Word Embeddings Zero-Shot Learning

Building Location Embeddings from Physical Trajectories and Textual Representations

no code implementations Asian Chapter of the Association for Computational Linguistics 2020 Laura Biester, Carmen Banea, Rada Mihalcea

Word embedding methods have become the de-facto way to represent words, having been successfully applied to a wide array of natural language processing tasks.

Exploring the Value of Personalized Word Embeddings

no code implementations COLING 2020 Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

Our results show that a subset of words belonging to specific psycholinguistic categories tend to vary more in their representations across users and that combining generic and personalized word embeddings yields the best performance, with a 4.7% relative reduction in perplexity.

Authorship Attribution Language Modelling +1

Biased TextRank: Unsupervised Graph-Based Content Extraction

no code implementations COLING 2020 Ashkan Kazemi, Verónica Pérez-Rosas, Rada Mihalcea

We introduce Biased TextRank, a graph-based content extraction method inspired by the popular TextRank algorithm that ranks text spans according to their importance for language processing tasks and according to their relevance to an input "focus."

Deep Learning for Text Style Transfer: A Survey

2 code implementations CL (ACL) 2022 Di Jin, Zhijing Jin, Zhiting Hu, Olga Vechtomova, Rada Mihalcea

Text style transfer is an important task in natural language generation, which aims to control certain attributes in the generated text, such as politeness, emotion, humor, and many others.

Style Transfer Text Attribute Transfer +1

Compositional Demographic Word Embeddings

1 code implementation EMNLP 2020 Charles Welch, Jonathan K. Kummerfeld, Verónica Pérez-Rosas, Rada Mihalcea

Word embeddings are usually derived from corpora containing text from many individuals, thus leading to general purpose representations rather than individually personalized representations.

Language Modelling Word Embeddings

MIME: MIMicking Emotions for Empathetic Response Generation

1 code implementation EMNLP 2020 Navonil Majumder, Pengfei Hong, Shanshan Peng, Jiankun Lu, Deepanway Ghosal, Alexander Gelbukh, Rada Mihalcea, Soujanya Poria

Current approaches to empathetic response generation view the set of emotions expressed in the input text as a flat structure, where all the emotions are treated uniformly.

Empathetic Response Generation Response Generation

Improving Low Compute Language Modeling with In-Domain Embedding Initialisation

1 code implementation EMNLP 2020 Charles Welch, Rada Mihalcea, Jonathan K. Kummerfeld

In the process, we show that the standard convention of tying input and output embeddings does not improve perplexity when initializing with embeddings trained on in-domain data.

Language Modelling

Quantifying the Effects of COVID-19 on Mental Health Support Forums

no code implementations EMNLP (NLP-COVID19) 2020 Laura Biester, Katie Matton, Janarthanan Rajendran, Emily Mower Provost, Rada Mihalcea

The COVID-19 pandemic, like many of the disease outbreaks that have preceded it, is likely to have a profound effect on mental health.

Expressive Interviewing: A Conversational System for Coping with COVID-19

no code implementations EMNLP (NLP-COVID19) 2020 Charles Welch, Allison Lahnala, Verónica Pérez-Rosas, Siqi Shen, Sarah Seraj, Larry An, Kenneth Resnicow, James Pennebaker, Rada Mihalcea

The ongoing COVID-19 pandemic has raised concerns for many regarding personal and public health implications, financial security and economic stability.

KinGDOM: Knowledge-Guided DOMain adaptation for sentiment analysis

1 code implementation ACL 2020 Deepanway Ghosal, Devamanyu Hazarika, Abhinaba Roy, Navonil Majumder, Rada Mihalcea, Soujanya Poria

Cross-domain sentiment analysis has received significant attention in recent years, prompted by the need to combat the domain gap between different applications that make use of sentiment analysis.

Domain Adaptation Sentiment Analysis

Inferring Social Media Users' Mental Health Status from Multimodal Information

no code implementations LREC 2020 Zhentao Xu, Verónica Pérez-Rosas, Rada Mihalcea

In this paper, we explore the use of multimodal cues present in social media posts to predict users' mental health status.

General Classification

MuSE: a Multimodal Dataset of Stressed Emotion

no code implementations LREC 2020 Mimansa Jaiswal, Cristian-Paul Bara, Yuanhang Luo, Mihai Burzo, Rada Mihalcea, Emily Mower Provost

Endowing automated agents with the ability to provide support, entertainment and interaction with human beings requires sensing of the users' affective state.

Emotion Classification General Classification

Small Town or Metropolis? Analyzing the Relationship between Population Size and Language

no code implementations LREC 2020 Amy Rechkemmer, Steven Wilson, Rada Mihalcea

Using a set of over 2 million posts from distinct Twitter users around the country dating back as far as 2014, we ask the following question: is there a difference in how Americans express themselves online depending on whether they reside in an urban or rural area?

Cultural Vocal Bursts Intensity Prediction

Analyzing the Surprising Variability in Word Embedding Stability Across Languages

1 code implementation EMNLP 2021 Laura Burdick, Jonathan K. Kummerfeld, Rada Mihalcea

Word embeddings are powerful representations that form the foundation of many natural language processing architectures, both in English and in other languages.

Word Embeddings

Compositional Temporal Visual Grounding of Natural Language Event Descriptions

no code implementations 4 Dec 2019 Jonathan C. Stroud, Ryan McCaffrey, Rada Mihalcea, Jia Deng, Olga Russakovsky

Temporal grounding entails establishing a correspondence between natural language event descriptions and their visual depictions.

Visual Grounding

Towards Extracting Medical Family History from Natural Language Interactions: A New Dataset and Baselines

no code implementations IJCNLP 2019 Mahmoud Azab, Stephane Dadian, Vivi Nastase, Larry An, Rada Mihalcea

We introduce a new dataset consisting of natural language interactions annotated with medical family histories, obtained during interactions with a genetic counselor and through crowdsourcing, following a questionnaire created by experts in the domain.

Relation Extraction

Representing Movie Characters in Dialogues

no code implementations CONLL 2019 Mahmoud Azab, Noriyuki Kojima, Jia Deng, Rada Mihalcea

We introduce a new embedding model to represent movie characters and their interactions in a dialogue by encoding in the same representation the language used by these characters as well as information about the other participants in the dialogue.

Question Answering Relation Classification +1

Conversational Transfer Learning for Emotion Recognition

1 code implementation 11 Oct 2019 Devamanyu Hazarika, Soujanya Poria, Roger Zimmermann, Rada Mihalcea

We propose an approach, TL-ERC, where we pre-train a hierarchical dialogue model on multi-turn conversations (source) and then transfer its parameters to a conversational emotion classifier (target).

Emotion Recognition in Conversation Sentence +1

Towards Automatic Detection of Misinformation in Online Medical Videos

no code implementations 4 Sep 2019 Rui Hou, Verónica Pérez-Rosas, Stacy Loeb, Rada Mihalcea

Recent years have witnessed a significant increase in the online sharing of medical information, with videos representing a large fraction of such online sources.

Misinformation

Variational Fusion for Multimodal Sentiment Analysis

no code implementations 13 Aug 2019 Navonil Majumder, Soujanya Poria, Gangeshwar Krishnamurthy, Niyati Chhaya, Rada Mihalcea, Alexander Gelbukh

Multimodal fusion is considered a key step in multimodal tasks such as sentiment analysis, emotion detection, question answering, and others.

Multimodal Sentiment Analysis Question Answering

Predicting Human Activities from User-Generated Content

no code implementations ACL 2019 Steven R. Wilson, Rada Mihalcea

The activities we do are linked to our interests, personality, political preferences, and decisions we make about the future.

Clustering Sentence +2

What Makes a Good Counselor? Learning to Distinguish between High-quality and Low-quality Counseling Conversations

no code implementations ACL 2019 Verónica Pérez-Rosas, Xinyi Wu, Kenneth Resnicow, Rada Mihalcea

Our results suggest important language differences in low- and high-quality counseling, which we further use to derive linguistic features able to capture the differences between the two groups.

Women's Syntactic Resilience and Men's Grammatical Luck: Gender-Bias in Part-of-Speech Tagging and Dependency Parsing

no code implementations ACL 2019 Aparna Garimella, Carmen Banea, Dirk Hovy, Rada Mihalcea

Several linguistic studies have shown the prevalence of various lexical and grammatical patterns in texts authored by a person of a particular gender, but models for part-of-speech tagging and dependency parsing have still not adapted to account for these differences.

Dependency Parsing Part-Of-Speech Tagging

Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

1 code implementation ACL 2019 Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria

As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows.

Sarcasm Detection

Towards Multimodal Sarcasm Detection (An _Obviously_ Perfect Paper)

1 code implementation 5 Jun 2019 Santiago Castro, Devamanyu Hazarika, Verónica Pérez-Rosas, Roger Zimmermann, Rada Mihalcea, Soujanya Poria

As a first step towards enabling the development of multimodal approaches for sarcasm detection, we propose a new sarcasm dataset, Multimodal Sarcasm Detection Dataset (MUStARD), compiled from popular TV shows.

Sarcasm Detection

Box of Lies: Multimodal Deception Detection in Dialogues

no code implementations NAACL 2019 Felix Soldner, Verónica Pérez-Rosas, Rada Mihalcea

Deception often takes place during everyday conversations, yet conversational dialogues remain largely unexplored by current work on automatic deception detection.

Deception Detection General Classification

Look Who's Talking: Inferring Speaker Attributes from Personal Longitudinal Dialog

1 code implementation 25 Apr 2019 Charles Welch, Verónica Pérez-Rosas, Jonathan K. Kummerfeld, Rada Mihalcea

We examine a large dialog corpus obtained from the conversation history of a single individual with 104 conversation partners.

Attribute

A Comparative Analysis of Content-based Geolocation in Blogs and Tweets

no code implementations 19 Nov 2018 Konstantinos Pappas, Mahmoud Azab, Rada Mihalcea

The geolocation of online information is an essential component in any geospatial application.

DialogueRNN: An Attentive RNN for Emotion Detection in Conversations

2 code implementations 1 Nov 2018 Navonil Majumder, Soujanya Poria, Devamanyu Hazarika, Rada Mihalcea, Alexander Gelbukh, Erik Cambria

Emotion detection in conversations is a necessary step for a number of applications, including opinion mining over chat history, social media threads, debates, argumentation mining, understanding consumer feedback in live conversations, etc.

Emotion Classification Emotion Recognition in Conversation +2

Speaker Naming in Movies

no code implementations NAACL 2018 Mahmoud Azab, Mingzhe Wang, Max Smith, Noriyuki Kojima, Jia Deng, Rada Mihalcea

We propose a new model for speaker naming in movies that leverages visual, textual, and acoustic modalities in a unified optimization framework.

Multi-Label Transfer Learning for Multi-Relational Semantic Similarity

no code implementations SEMEVAL 2019 Li Zhang, Steven R. Wilson, Rada Mihalcea

Multi-relational semantic similarity datasets define the semantic relations between two short texts in multiple ways, e.g., similarity, relatedness, and so on.

Multi-Task Learning regression +3

Factors Influencing the Surprising Instability of Word Embeddings

2 code implementations NAACL 2018 Laura Wendlandt, Jonathan K. Kummerfeld, Rada Mihalcea

Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations.

Word Embeddings

Direct Network Transfer: Transfer Learning of Sentence Embeddings for Semantic Similarity

no code implementations 20 Apr 2018 Li Zhang, Steven R. Wilson, Rada Mihalcea

Sentence encoders, which produce sentence embeddings using neural networks, are typically evaluated by how well they transfer to downstream tasks.

Natural Language Understanding Semantic Similarity +5

Identifying Usage Expression Sentences in Consumer Product Reviews

no code implementations IJCNLP 2017 Shibamouli Lahiri, V.G. Vinod Vydiswaran, Rada Mihalcea

The system combines lexical, syntactic, and semantic features in a product-agnostic fashion to yield good classification performance.

Classification General Classification

Measuring Semantic Relations between Human Activities

no code implementations IJCNLP 2017 Steven Wilson, Rada Mihalcea

The things people do in their daily lives can provide valuable insights into their personality, values, and interests.

Semantic Textual Similarity

Demographic-aware word associations

no code implementations EMNLP 2017 Aparna Garimella, Carmen Banea, Rada Mihalcea

Variations of word associations across different groups of people can provide insights into people's psychologies and their world views.

Information Retrieval Keyword Extraction +1

Automatic Detection of Fake News

no code implementations COLING 2018 Verónica Pérez-Rosas, Bennett Kleinberg, Alexandra Lefevre, Rada Mihalcea

The proliferation of misleading information in everyday access media outlets such as social media feeds, news blogs, and online newspapers has made it challenging to identify trustworthy news sources, thus increasing the need for computational tools able to provide insights into the reliability of online content.

Fake News Detection

Predicting Counselor Behaviors in Motivational Interviewing Encounters

no code implementations EACL 2017 Verónica Pérez-Rosas, Rada Mihalcea, Kenneth Resnicow, Satinder Singh, Lawrence An, Kathy J. Goggin, Delwyn Catley

As the number of people receiving psycho-therapeutic treatment increases, the automatic evaluation of counseling practice arises as an important challenge in the clinical domain.

A Computational Analysis of the Language of Drug Addiction

no code implementations EACL 2017 Carlo Strapparava, Rada Mihalcea

We present a computational analysis of the language of drug users when talking about their drug experiences.

Predicting the Industry of Users on Social Media

no code implementations 24 Dec 2016 Konstantinos Pappas, Rada Mihalcea

Automatic profiling of social media users is an important task for supporting a multitude of downstream applications.

Ensemble Learning Feature Engineering +1

Stateology: State-Level Interactive Charting of Language, Feelings, and Values

no code implementations 20 Dec 2016 Konstantinos Pappas, Steven Wilson, Rada Mihalcea

People's personality and motivations are manifest in their everyday language usage.

Targeted Sentiment to Understand Student Comments

no code implementations COLING 2016 Charles Welch, Rada Mihalcea

We address the task of targeted sentiment as a means of understanding the sentiment that students hold toward courses and instructors, as expressed by students in their comments.

Decision Making Entity Extraction using GAN +1

Identifying Cross-Cultural Differences in Word Usage

no code implementations COLING 2016 Aparna Garimella, Rada Mihalcea, James Pennebaker

Personal writings have inspired researchers in the fields of linguistics and psychology to study the relationship between language and culture to better understand the psychology of people across different cultures.

Cultural Vocal Bursts Intensity Prediction

Zooming in on Gender Differences in Social Media

no code implementations WS 2016 Aparna Garimella, Rada Mihalcea

Men are from Mars and women are from Venus - or so the genre of relationship literature would have us believe.

General Classification Sociology +3

Building a Dataset for Possessions Identification in Text

no code implementations LREC 2016 Carmen Banea, Xi Chen, Rada Mihalcea

Just as industrialization matured from mass production to customization and personalization, so has the Web migrated from generic content to public disclosures of one's most intimately held thoughts, opinions and beliefs.

Mining Semantic Affordances of Visual Object Categories

no code implementations CVPR 2015 Yu-Wei Chao, Zhan Wang, Rada Mihalcea, Jia Deng

In this paper we introduce the new problem of mining the knowledge of semantic affordance: given an object, determining whether an action can be performed on it.

Collaborative Filtering Object

Modeling Language Proficiency Using Implicit Feedback

no code implementations LREC 2014 Chris Hokamp, Rada Mihalcea, Peter Schuelke

We describe the results of several experiments with interactive interfaces for native and L2 English students, designed to collect implicit feedback from students as they complete a reading activity.

Reading Comprehension Text Simplification

A Multimodal Dataset for Deception Detection

no code implementations LREC 2014 Verónica Pérez-Rosas, Rada Mihalcea, Alexis Narvaez, Mihai Burzo

This paper presents the construction of a multimodal dataset for deception detection, including physiological, thermal, and visual responses of human subjects under three deceptive scenarios.

Deception Detection

Authorship Attribution Using Word Network Features

no code implementations 12 Nov 2013 Shibamouli Lahiri, Rada Mihalcea

The goal of our paper is to explore properties of these complex networks that are suitable as features for machine-learning-based authorship attribution of documents.

Authorship Attribution BIG-bench Machine Learning