Search Results for author: Soroush Vosoughi

Found 83 papers, 31 papers with code

TWEETSPIN: Fine-grained Propaganda Detection in Social Media Using Multi-View Representations

no code implementations NAACL 2022 Prashanth Vijayaraghavan, Soroush Vosoughi

Our model relies on multi-view representations of the input tweet data to (a) extract different aspects of the input text including the context, entities, their relationships, and external knowledge; (b) model their mutual interplay; and (c) effectively speed up the learning process by requiring fewer training examples.

Implicit Relations Logical Fallacies +1

Aligning Generative Language Models with Human Values

no code implementations Findings (NAACL) 2022 Ruibo Liu, Ge Zhang, Xinyu Feng, Soroush Vosoughi

Although current large-scale generative language models (LMs) can show impressive insights about factual knowledge, they do not exhibit similar success with respect to human values judgements (e.g., whether or not the generations of an LM are moral).

Text Generation Transfer Learning

Dartmouth at SemEval-2022 Task 6: Detection of Sarcasm

no code implementations SemEval (NAACL) 2022 Rishik Lad, Weicheng Ma, Soroush Vosoughi

This paper introduces the result of Team Dartmouth’s experiments on each of the five subtasks for the detection of sarcasm in English and Arabic tweets.

Data Augmentation Sarcasm Detection

Communication is All You Need: Persuasion Dataset Construction via Multi-LLM Communication

no code implementations 13 Feb 2025 Weicheng Ma, Hefan Zhang, Ivory Yang, Shiyu Ji, Joice Chen, Farnoosh Hashemi, Shubham Mohole, Ethan Gearey, Michael Macy, Saeed Hassanpour, Soroush Vosoughi

Large Language Models (LLMs) have shown proficiency in generating persuasive dialogue, yet concerns about the fluency and sophistication of their outputs persist.

All Diversity

Temporal Working Memory: Query-Guided Segment Refinement for Enhanced Multimodal Understanding

1 code implementation 9 Feb 2025 Xingjian Diao, Chunhui Zhang, Weiyi Wu, Zhongyu Ouyang, Peijun Qing, Ming Cheng, Soroush Vosoughi, Jiang Gui

To overcome these challenges, we introduce a specialized cognitive module, temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of MFMs.

Image Captioning Image-text Retrieval +5

Is It Navajo? Accurate Language Detection in Endangered Athabaskan Languages

1 code implementation 27 Jan 2025 Ivory Yang, Weicheng Ma, Chunhui Zhang, Soroush Vosoughi

Endangered languages, such as Navajo - the most widely spoken Native American language - are significantly underrepresented in contemporary language technologies, exacerbating the challenges of their preservation and revitalization.

Diversity Language Identification +3

NushuRescue: Revitalization of the Endangered Nushu Language with AI

1 code implementation 29 Nov 2024 Ivory Yang, Weicheng Ma, Soroush Vosoughi

The preservation and revitalization of endangered and extinct languages is a meaningful endeavor, conserving cultural heritage while enriching fields like linguistics and anthropology.

Sentence

ImpScore: A Learnable Metric For Quantifying The Implicitness Level of Language

1 code implementation 7 Nov 2024 Yuxin Wang, Xiaomeng Zhu, Weimin Lyu, Saeed Hassanpour, Soroush Vosoughi

Handling implicit language is essential for natural language processing systems to achieve precise text understanding and facilitate natural interactions with users.

Contrastive Learning Hate Speech Detection +1

Achieving Domain-Independent Certified Robustness via Knowledge Continuity

1 code implementation 3 Nov 2024 Alan Sun, Chiyu Ma, Kenneth Ge, Soroush Vosoughi

We present knowledge continuity, a novel definition inspired by Lipschitz continuity which aims to certify the robustness of neural networks across input domains (such as continuous and discrete domains in vision and language, respectively).

On the Exploration of LM-Based Soft Modular Robot Design

no code implementations 1 Nov 2024 Weicheng Ma, Luyang Zhao, Chun-Yi She, Yitao Jiang, Alan Sun, Bo Zhu, Devin Balkcom, Soroush Vosoughi

Recent large language models (LLMs) have demonstrated promising capabilities in modeling real-world knowledge and enhancing knowledge-based generation tasks.

World Knowledge

Interpretable Image Classification with Adaptive Prototype-based Vision Transformers

1 code implementation 28 Oct 2024 Chiyu Ma, Jon Donnelly, Wenjun Liu, Soroush Vosoughi, Cynthia Rudin, Chaofan Chen

We present ProtoViT, a method for interpretable image classification combining deep learning and case-based reasoning.

Image Classification

AlphaLoRA: Assigning LoRA Experts Based on Layer Training Quality

1 code implementation 14 Oct 2024 Peijun Qing, Chongyang Gao, Yefan Zhou, Xingjian Diao, Yaoqing Yang, Soroush Vosoughi

However, inspired by the observed redundancy in traditional MoE structures, previous studies identify similar redundancy among LoRA experts within the MoE architecture, highlighting the necessity for non-uniform allocation of LoRA experts across different layers.

parameter-efficient fine-tuning

Enhanced Detection of Conversational Mental Manipulation Through Advanced Prompting Techniques

no code implementations 14 Aug 2024 Ivory Yang, Xiaobo Guo, Sean Xie, Soroush Vosoughi

This study presents a comprehensive, long-term project to explore the effectiveness of various prompting techniques in detecting dialogical mental manipulation.

Semantic Compositions Enhance Vision-Language Contrastive Learning

no code implementations 1 Jul 2024 Maxwell Aladago, Lorenzo Torresani, Soroush Vosoughi

In the field of vision-language contrastive learning, models such as CLIP capitalize on matched image-caption pairs as positive examples and leverage within-batch non-matching pairs as negatives.

Classification Contrastive Learning +6

Serial Position Effects of Large Language Models

no code implementations 23 Jun 2024 Xiaobo Guo, Soroush Vosoughi

Large Language Models (LLMs) have shown remarkable capabilities in zero-shot learning applications, generating responses to queries using only pre-training information without the need for additional fine-tuning.

Position Zero-Shot Learning

Judging the Judges: A Systematic Study of Position Bias in LLM-as-a-Judge

1 code implementation 12 Jun 2024 Lin Shi, Chiyu Ma, Wenhua Liang, Weicheng Ma, Soroush Vosoughi

LLM-as-a-Judge presents a promising alternative to human evaluators across various tasks, but inherent biases, especially position bias - a tendency to favor solutions based on their position in the prompt - have compromised its effectiveness.

Fairness Position

MODABS: Multi-Objective Learning for Dynamic Aspect-Based Summarization

no code implementations 5 Jun 2024 Xiaobo Guo, Soroush Vosoughi

The rapid proliferation of online content necessitates effective summarization methods, among which dynamic aspect-based summarization stands out.

Decoder

MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations

1 code implementation 26 May 2024 Yuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi

We anticipate that MentalManip will stimulate further research, leading to progress in both understanding and mitigating the impact of mental manipulation in conversations.

Disordered-DABS: A Benchmark for Dynamic Aspect-Based Summarization in Disordered Texts

1 code implementation 16 Feb 2024 Xiaobo Guo, Soroush Vosoughi

Aspect-based summarization has seen significant advancements, especially in structured text.

Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models

1 code implementation 3 Nov 2023 Sean Xie, Soroush Vosoughi, Saeed Hassanpour

Large Language Models (LLMs) have significantly advanced the field of Natural Language Processing (NLP), but their lack of interpretability has been a major concern.

Improving Representation Learning for Histopathologic Images with Cluster Constraints

1 code implementation ICCV 2023 Weiyi Wu, Chongyang Gao, Joseph DiPalma, Soroush Vosoughi, Saeed Hassanpour

This framework aims for transferable representation learning and semantically meaningful clustering by synergizing invariance loss and clustering loss in WSI analysis.

Clustering Representation Learning +1

Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction

1 code implementation 5 Oct 2023 Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang

Our experimental findings demonstrate that our approach accelerates the training of vision-language models by a factor of 5 without a noticeable impact on overall performance.

Representation Learning Text Generation

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

1 code implementation NeurIPS 2023 Yiren Jian, Chongyang Gao, Soroush Vosoughi

We present a novel methodology aimed at optimizing the application of frozen large language models (LLMs) for resource-intensive vision-language (VL) pre-training.

Image to text

Graph-Level Embedding for Time-Evolving Graphs

no code implementations 1 Jun 2023 Lili Wang, Chenghan Huang, Weicheng Ma, Xinyuan Cao, Soroush Vosoughi

We evaluate our proposed model on five publicly available datasets for the task of temporal graph similarity ranking, and our model outperforms baseline methods.

Anomaly Detection Graph Representation Learning +5

Joint Latent Topic Discovery and Expectation Modeling for Financial Markets

no code implementations 1 Jun 2023 Lili Wang, Chenghan Huang, Chongyang Gao, Weicheng Ma, Soroush Vosoughi

In the pursuit of accurate and scalable quantitative methods for financial market analysis, the focus has shifted from individual stock models to those capturing interrelations between companies and their stocks.

Training Socially Aligned Language Models on Simulated Social Interactions

1 code implementation 26 May 2023 Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi

Social alignment in AI systems aims to ensure that these models behave according to established societal values.

Knowledge from Large-Scale Protein Contact Prediction Models Can Be Transferred to the Data-Scarce RNA Contact Prediction Task

1 code implementation 13 Feb 2023 Yiren Jian, Chongyang Gao, Chen Zeng, Yunjie Zhao, Soroush Vosoughi

Our findings indicate that the learned structural patterns of proteins can be transferred to RNAs, opening up potential new avenues for research.

Prediction Transfer Learning

Mind's Eye: Grounded Language Model Reasoning through Simulation

no code implementations 11 Oct 2022 Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai

By training solely on written text, current language models (LMs) miss the grounded experience of humans in the real-world -- their failure to relate language to the physical world causes knowledge to be misrepresented and obvious mistakes in their reasoning.

Language Modeling Language Modelling +2

Language Models are Multilingual Chain-of-Thought Reasoners

4 code implementations 6 Oct 2022 Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

GSM8K Math

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

1 code implementation 20 Sep 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

This indicates that Transformer models are able to generalize better by doing a similar task (i.e., clustering) with unpaired examples from different modalities in a multi-task fashion.

Clustering Contrastive Learning +3

Interpretation Quality Score for Measuring the Quality of interpretability methods

no code implementations 24 May 2022 Yuansheng Xie, Soroush Vosoughi, Saeed Hassanpour

Machine learning (ML) models have been applied to a wide range of natural language processing (NLP) tasks in recent years.

Embedding Hallucination for Few-Shot Language Fine-tuning

1 code implementation NAACL 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few-labeled sentences.

Data Augmentation Hallucination +2

Contrastive Learning for Prompt-Based Few-Shot Language Learners

1 code implementation NAACL 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

Following this line of work, we present a contrastive learning framework that clusters inputs from the same class for better generality of models trained with only limited examples.

Contrastive Learning In-Context Learning +3

Non-Parallel Text Style Transfer with Self-Parallel Supervision

1 code implementation ICLR 2022 Ruibo Liu, Chongyang Gao, Chenyan Jia, Guangxuan Xu, Soroush Vosoughi

The performance of existing text style transfer models is severely limited by the non-parallel datasets on which the models are trained.

Imitation Learning Style Transfer +1

Knowledge Infused Decoding

1 code implementation ICLR 2022 Ruibo Liu, Guoqing Zheng, Shashank Gupta, Radhika Gaonkar, Chongyang Gao, Soroush Vosoughi, Milad Shokouhi, Ahmed Hassan Awadallah

Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks.

counterfactual Question Answering +1

Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning

no code implementations 30 Mar 2022 Sean Xie, Soroush Vosoughi, Saeed Hassanpour

Artificial intelligence, particularly through recent advancements in deep learning, has achieved exceptional performances in many tasks in fields such as natural language processing and computer vision.

Decision Making Deep Reinforcement Learning +2

EnCBP: A New Benchmark Dataset for Finer-Grained Cultural Background Prediction in English

no code implementations Findings (ACL) 2022 Weicheng Ma, Samiha Datta, Lili Wang, Soroush Vosoughi

While cultural backgrounds have been shown to affect linguistic expressions, existing natural language processing (NLP) research on culture modeling is overly coarse-grained and does not examine cultural differences among speakers of the same language.

Cultural Vocal Bursts Intensity Prediction Language Modeling +6

Emotion-based Modeling of Mental Disorders on Social Media

no code implementations 24 Jan 2022 Xiaobo Guo, Yaojia Sun, Soroush Vosoughi

Our proposed model is different from other work in this area in that our model is based entirely on the emotional states, and the transition between these states of users on Reddit, whereas prior work is typically based on content-based representations (e.g., n-grams, language model embeddings, etc.).

Language Modelling

Embedding Node Structural Role Identity Using Stress Majorization

no code implementations 14 Sep 2021 Lili Wang, Chenghan Huang, Weicheng Ma, Ying Lu, Soroush Vosoughi

In this paper, we present a novel and flexible framework using stress majorization, to transform the high-dimensional role identities in networks directly (without approximation or indirect modeling) to a low-dimensional embedding space.

Node Classification

GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks

no code implementations EMNLP 2021 Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi

Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model, from 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmarks.

Multi-Task Learning Natural Language Understanding

Language Model Augmented Relevance Score

no code implementations ACL 2021 Ruibo Liu, Jason Wei, Soroush Vosoughi

Although automated metrics are commonly used to evaluate NLG systems, they often correlate poorly with human judgements.

Language Modeling Language Modelling +2

Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks

no code implementations ACL 2021 Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi

Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments.

XLM-R

Modulating Language Models with Emotions

no code implementations Findings (ACL) 2021 Ruibo Liu, Jason Wei, Chenyan Jia, Soroush Vosoughi

Generating context-aware language that embodies diverse emotions is an important step towards building empathetic NLP systems.

Diversity Response Generation

Linguistic Complexity Loss in Text-Based Therapy

no code implementations NAACL 2021 Jason Wei, Kelly Finn, Emma Templeton, Thalia Wheatley, Soroush Vosoughi

The recent advent of online text-based therapy presents a new opportunity to analyze the complexity loss paradox in a novel operationalization: linguistic complexity loss in text-based therapy conversations.

Diversity

A Survey of Data Augmentation Approaches for NLP

1 code implementation Findings (ACL) 2021 Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy

In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner.

Data Augmentation Survey

Mitigating Political Bias in Language Models Through Reinforced Calibration

no code implementations 30 Apr 2021 Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Lili Wang, Soroush Vosoughi

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings.

reinforcement-learning Reinforcement Learning (RL) +1

BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models

1 code implementation SEMEVAL 2021 Aadil Islam, Weicheng Ma, Soroush Vosoughi

This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context.

Feature Engineering Lexical Complexity Prediction

Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

1 code implementation NAACL 2021 Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu

Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category.

Data Augmentation Few-Shot Text Classification +3

Feature Selection for Multivariate Time Series via Network Pruning

1 code implementation 11 Feb 2021 Kang Gu, Soroush Vosoughi, Temiloluwa Prioleau

In recent years, there has been an ever increasing amount of multivariate time series (MTS) data in various domains, typically generated by a large family of sensors such as wearable devices.

feature selection Network Pruning +2

Text Augmentation in a Multi-Task View

no code implementations EACL 2021 Jason Wei, Chengyu Huang, Shiqi Xu, Soroush Vosoughi

Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training.

Text Augmentation text-classification +1

Political Depolarization of News Articles Using Attribute-aware Word Embeddings

no code implementations 5 Jan 2021 Ruibo Liu, Lili Wang, Chenyan Jia, Soroush Vosoughi

To detect polar words, we train a multi-attribute-aware word embedding model that is aware of ideology and topics on 360k full-length media articles.

Attribute Text Generation +1

Social media data reveals signal for public consumer perceptions

no code implementations 26 Dec 2020 Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi

By using decadal data (2008-2019) from Reddit, we show that both monthly and daily estimates of CCI can, indeed, be reliably estimated at least several months in advance, and that our model estimates are far superior to those generated by the existing methods.

Multi-modal Identification of State-Sponsored Propaganda on Social Media

no code implementations 24 Dec 2020 Xiaobo Guo, Soroush Vosoughi

The prevalence of state-sponsored propaganda on the Internet has become a cause for concern in recent years.

Improvements and Extensions on Metaphor Detection

no code implementations ACL (unimplicit) 2021 Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi

Finally, we clean up the improper or outdated annotations in one of the MD benchmark datasets and re-benchmark it with our Transformer-based model.

Natural Language Understanding

Enhanced Offensive Language Detection Through Data Augmentation

no code implementations 5 Dec 2020 Ruibo Liu, Guangxuan Xu, Soroush Vosoughi

In this work, we present Dager (Data Augmenter), a generation-based data augmentation method, that improves the performance of classification on imbalanced and low-resource data such as the offensive language dataset.

Data Augmentation Task 2

Embedding Node Structural Role Identity into Hyperbolic Space

no code implementations 3 Nov 2020 Lili Wang, Ying Lu, Chenghan Huang, Soroush Vosoughi

However, the work on network embedding in hyperbolic space has been focused on microscopic node embedding.

Network Embedding

Towards Improved Model Design for Authorship Identification: A Survey on Writing Style Understanding

no code implementations 30 Sep 2020 Weicheng Ma, Ruibo Liu, Li-Li Wang, Soroush Vosoughi

While other tasks based on linguistic style understanding benefit from deep learning methods, these methods have not behaved as well as traditional machine learning methods in many authorship-based tasks.

BIG-bench Machine Learning Natural Language Understanding +1

Emoji Prediction: Extensions and Benchmarking

1 code implementation 14 Jul 2020 Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi

In this paper, we extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification on the task.

Benchmarking Multi-Label Classification +2

Query-Free Adversarial Transfer via Undertrained Surrogates

no code implementations 1 Jul 2020 Chris Miller, Soroush Vosoughi

Deep neural networks are vulnerable to adversarial examples -- minor perturbations added to a model's input which cause the model to output an incorrect prediction.

Adversarial Attack

Salienteye: Maximizing Engagement While Maintaining Artistic Style on Instagram Using Deep Neural Networks

no code implementations 13 Jun 2020 Lili Wang, Ruibo Liu, Soroush Vosoughi

Once trained on their accounts, users can have new photos sorted based on predicted engagement and style similarity to their previous work, thus enabling them to upload photos that not only have the potential to maximize engagement from their followers but also maintain their style of photography.

Object Recognition Transfer Learning

What Are People Asking About COVID-19? A Question Classification Dataset

2 code implementations ACL 2020 Jerry Wei, Chengyu Huang, Soroush Vosoughi, Jason Wei

We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters.

Clustering General Classification +1

Enhanced Twitter Sentiment Classification Using Contextual Information

no code implementations WS 2015 Soroush Vosoughi, Helen Zhou, Deb Roy

This combined classifier outperforms the purely linguistic classifier, showing that integrating the rich contextual information available on Twitter into sentiment classification is a promising direction of research.

Classification General Classification +2

Tweet Acts: A Speech Act Classifier for Twitter

no code implementations 17 May 2016 Soroush Vosoughi, Deb Roy

We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features.

General Classification Multi-class Classification +1

Automatic Detection and Categorization of Election-Related Tweets

no code implementations 17 May 2016 Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy

With the rise in popularity of public social media and micro-blogging services, most notably Twitter, people have found a venue to hear and be heard by their peers without an intermediary.

Diversity

A Semi-automatic Method for Efficient Detection of Stories on Social Media

no code implementations 17 May 2016 Soroush Vosoughi, Deb Roy

In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter.

Digital Stylometry: Linking Profiles Across Social Networks

no code implementations 17 May 2016 Soroush Vosoughi, Helen Zhou, Deb Roy

There is an ever growing number of users with accounts on multiple social media and networking sites.
