Search Results for author: Soroush Vosoughi

Found 66 papers, 19 papers with code

Aligning Generative Language Models with Human Values

no code implementations Findings (NAACL) 2022 Ruibo Liu, Ge Zhang, Xinyu Feng, Soroush Vosoughi

Although current large-scale generative language models (LMs) can show impressive insights about factual knowledge, they do not exhibit similar success with respect to human values judgements (e.g., whether or not the generations of an LM are moral).

Text Generation Transfer Learning

TWEETSPIN: Fine-grained Propaganda Detection in Social Media Using Multi-View Representations

no code implementations NAACL 2022 Prashanth Vijayaraghavan, Soroush Vosoughi

Our model relies on multi-view representations of the input tweet data to (a) extract different aspects of the input text including the context, entities, their relationships, and external knowledge; (b) model their mutual interplay; and (c) effectively speed up the learning process by requiring fewer training examples.

Implicit Relations Logical Fallacies +1

Dartmouth at SemEval-2022 Task 6: Detection of Sarcasm

no code implementations SemEval (NAACL) 2022 Rishik Lad, Weicheng Ma, Soroush Vosoughi

This paper introduces the result of Team Dartmouth’s experiments on each of the five subtasks for the detection of sarcasm in English and Arabic tweets.

Data Augmentation Sarcasm Detection

Disordered-DABS: A Benchmark for Dynamic Aspect-Based Summarization in Disordered Texts

no code implementations 16 Feb 2024 Xiaobo Guo, Soroush Vosoughi

Aspect-based summarization has seen significant advancements, especially in structured text.

Proto-lm: A Prototypical Network-Based Framework for Built-in Interpretability in Large Language Models

1 code implementation 3 Nov 2023 Sean Xie, Soroush Vosoughi, Saeed Hassanpour

Large Language Models (LLMs) have significantly advanced the field of Natural Language Processing (NLP), but their lack of interpretability has been a major concern.

Improving Representation Learning for Histopathologic Images with Cluster Constraints

1 code implementation ICCV 2023 Weiyi Wu, Chongyang Gao, Joseph DiPalma, Soroush Vosoughi, Saeed Hassanpour

This framework aims for transferable representation learning and semantically meaningful clustering by synergizing invariance loss and clustering loss in WSI analysis.

Clustering Representation Learning +1

Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction

1 code implementation 5 Oct 2023 Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang

Our experimental findings demonstrate that our approach accelerates the training of vision-language models by a factor of 5 without a noticeable impact on overall performance.

Representation Learning Text Generation

Bootstrapping Vision-Language Learning with Decoupled Language Pre-training

1 code implementation NeurIPS 2023 Yiren Jian, Chongyang Gao, Soroush Vosoughi

We present a novel methodology aimed at optimizing the application of frozen large language models (LLMs) for resource-intensive vision-language (VL) pre-training.

Graph-Level Embedding for Time-Evolving Graphs

no code implementations 1 Jun 2023 Lili Wang, Chenghan Huang, Weicheng Ma, Xinyuan Cao, Soroush Vosoughi

We evaluate our proposed model on five publicly available datasets for the task of temporal graph similarity ranking, and our model outperforms baseline methods.

Anomaly Detection Graph Representation Learning +4

Joint Latent Topic Discovery and Expectation Modeling for Financial Markets

no code implementations 1 Jun 2023 Lili Wang, Chenghan Huang, Chongyang Gao, Weicheng Ma, Soroush Vosoughi

In the pursuit of accurate and scalable quantitative methods for financial market analysis, the focus has shifted from individual stock models to those capturing interrelations between companies and their stocks.

Training Socially Aligned Language Models on Simulated Social Interactions

1 code implementation 26 May 2023 Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi

Social alignment in AI systems aims to ensure that these models behave according to established societal values.

Knowledge from Large-Scale Protein Contact Prediction Models Can Be Transferred to the Data-Scarce RNA Contact Prediction Task

1 code implementation 13 Feb 2023 Yiren Jian, Chongyang Gao, Chen Zeng, Yunjie Zhao, Soroush Vosoughi

Our findings indicate that the learned structural patterns of proteins can be transferred to RNAs, opening up potential new avenues for research.

Transfer Learning

Mind's Eye: Grounded Language Model Reasoning through Simulation

no code implementations 11 Oct 2022 Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai

By training solely on written text, current language models (LMs) miss the grounded experience of humans in the real-world -- their failure to relate language to the physical world causes knowledge to be misrepresented and obvious mistakes in their reasoning.

Language Modelling

Language Models are Multilingual Chain-of-Thought Reasoners

2 code implementations 6 Oct 2022 Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei

Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.

GSM8K Math

Non-Linguistic Supervision for Contrastive Learning of Sentence Embeddings

1 code implementation 20 Sep 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

This indicates that Transformer models are able to generalize better by doing a similar task (i.e., clustering) with unpaired examples from different modalities in a multi-task fashion.

Clustering Contrastive Learning +3

Interpretation Quality Score for Measuring the Quality of interpretability methods

no code implementations 24 May 2022 Yuansheng Xie, Soroush Vosoughi, Saeed Hassanpour

Machine learning (ML) models have been applied to a wide range of natural language processing (NLP) tasks in recent years.

Embedding Hallucination for Few-Shot Language Fine-tuning

1 code implementation NAACL 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few-labeled sentences.

Data Augmentation Hallucination +1

Contrastive Learning for Prompt-Based Few-Shot Language Learners

1 code implementation NAACL 2022 Yiren Jian, Chongyang Gao, Soroush Vosoughi

Following this line of work, we present a contrastive learning framework that clusters inputs from the same class for better generality of models trained with only limited examples.

Contrastive Learning In-Context Learning +2

Non-Parallel Text Style Transfer with Self-Parallel Supervision

1 code implementation ICLR 2022 Ruibo Liu, Chongyang Gao, Chenyan Jia, Guangxuan Xu, Soroush Vosoughi

The performance of existing text style transfer models is severely limited by the non-parallel datasets on which the models are trained.

Imitation Learning Style Transfer +1

Knowledge Infused Decoding

1 code implementation ICLR 2022 Ruibo Liu, Guoqing Zheng, Shashank Gupta, Radhika Gaonkar, Chongyang Gao, Soroush Vosoughi, Milad Shokouhi, Ahmed Hassan Awadallah

Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks.

counterfactual Question Answering +1

Towards Interpretable Deep Reinforcement Learning Models via Inverse Reinforcement Learning

no code implementations 30 Mar 2022 Sean Xie, Soroush Vosoughi, Saeed Hassanpour

Artificial intelligence, particularly through recent advancements in deep learning, has achieved exceptional performances in many tasks in fields such as natural language processing and computer vision.

Decision Making reinforcement-learning +1

EnCBP: A New Benchmark Dataset for Finer-Grained Cultural Background Prediction in English

no code implementations Findings (ACL) 2022 Weicheng Ma, Samiha Datta, Lili Wang, Soroush Vosoughi

While cultural backgrounds have been shown to affect linguistic expressions, existing natural language processing (NLP) research on culture modeling is overly coarse-grained and does not examine cultural differences among speakers of the same language.

Cultural Vocal Bursts Intensity Prediction Language Modelling +5

Emotion-based Modeling of Mental Disorders on Social Media

no code implementations 24 Jan 2022 Xiaobo Guo, Yaojia Sun, Soroush Vosoughi

Our proposed model is different from other work in this area in that our model is based entirely on the emotional states, and the transition between these states of users on Reddit, whereas prior work is typically based on content-based representations (e.g., n-grams, language model embeddings, etc.).

Language Modelling

Embedding Node Structural Role Identity Using Stress Majorization

no code implementations 14 Sep 2021 Lili Wang, Chenghan Huang, Weicheng Ma, Ying Lu, Soroush Vosoughi

In this paper, we present a novel and flexible framework using stress majorization, to transform the high-dimensional role identities in networks directly (without approximation or indirect modeling) to a low-dimensional embedding space.

Node Classification

GradTS: A Gradient-Based Automatic Auxiliary Task Selection Method Based on Transformer Networks

no code implementations EMNLP 2021 Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi

Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model, from 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmarks.

Multi-Task Learning Natural Language Understanding

Language Model Augmented Relevance Score

no code implementations ACL 2021 Ruibo Liu, Jason Wei, Soroush Vosoughi

Although automated metrics are commonly used to evaluate NLG systems, they often correlate poorly with human judgements.

Language Modelling nlg evaluation

Contributions of Transformer Attention Heads in Multi- and Cross-lingual Tasks

no code implementations ACL 2021 Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi

Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments.

XLM-R

Modulating Language Models with Emotions

no code implementations Findings (ACL) 2021 Ruibo Liu, Jason Wei, Chenyan Jia, Soroush Vosoughi

Generating context-aware language that embodies diverse emotions is an important step towards building empathetic NLP systems.

Response Generation

Linguistic Complexity Loss in Text-Based Therapy

no code implementations NAACL 2021 Jason Wei, Kelly Finn, Emma Templeton, Thalia Wheatley, Soroush Vosoughi

The recent advent of online text-based therapy presents a new opportunity to analyze the complexity loss paradox in a novel operationalization: linguistic complexity loss in text-based therapy conversations.

A Survey of Data Augmentation Approaches for NLP

1 code implementation Findings (ACL) 2021 Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy

In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner.

Data Augmentation

Mitigating Political Bias in Language Models Through Reinforced Calibration

no code implementations 30 Apr 2021 Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Lili Wang, Soroush Vosoughi

Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings.

reinforcement-learning Reinforcement Learning (RL) +1

BigGreen at SemEval-2021 Task 1: Lexical Complexity Prediction with Assembly Models

1 code implementation SEMEVAL 2021 Aadil Islam, Weicheng Ma, Soroush Vosoughi

This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context.

Feature Engineering Lexical Complexity Prediction

Few-Shot Text Classification with Triplet Networks, Data Augmentation, and Curriculum Learning

1 code implementation NAACL 2021 Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu

Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category.

Data Augmentation Few-Shot Text Classification +2

Feature Selection for Multivariate Time Series via Network Pruning

1 code implementation 11 Feb 2021 Kang Gu, Soroush Vosoughi, Temiloluwa Prioleau

In recent years, there has been an ever increasing amount of multivariate time series (MTS) data in various domains, typically generated by a large family of sensors such as wearable devices.

feature selection Network Pruning +2

Text Augmentation in a Multi-Task View

no code implementations EACL 2021 Jason Wei, Chengyu Huang, Shiqi Xu, Soroush Vosoughi

Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training.

Text Augmentation text-classification +1

Political Depolarization of News Articles Using Attribute-aware Word Embeddings

no code implementations 5 Jan 2021 Ruibo Liu, Lili Wang, Chenyan Jia, Soroush Vosoughi

To detect polar words, we train a multi-attribute-aware word embedding model that is aware of ideology and topics on 360k full-length media articles.

Attribute Text Generation +1

Social media data reveals signal for public consumer perceptions

no code implementations 26 Dec 2020 Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi

By using decadal data (2008-2019) from Reddit, we show that both monthly and daily estimates of CCI can, indeed, be reliably estimated at least several months in advance, and that our model estimates are far superior to those generated by the existing methods.

Multi-modal Identification of State-Sponsored Propaganda on Social Media

no code implementations 24 Dec 2020 Xiaobo Guo, Soroush Vosoughi

The prevalence of state-sponsored propaganda on the Internet has become a cause for concern in recent years.

Improvements and Extensions on Metaphor Detection

no code implementations ACL (unimplicit) 2021 Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi

Finally, we clean up the improper or outdated annotations in one of the MD benchmark datasets and re-benchmark it with our Transformer-based model.

Natural Language Understanding

Enhanced Offensive Language Detection Through Data Augmentation

no code implementations 5 Dec 2020 Ruibo Liu, Guangxuan Xu, Soroush Vosoughi

In this work, we present Dager (Data Augmenter), a generation-based data augmentation method, that improves the performance of classification on imbalanced and low-resource data such as the offensive language dataset.

Data Augmentation Task 2

Embedding Node Structural Role Identity into Hyperbolic Space

no code implementations 3 Nov 2020 Lili Wang, Ying Lu, Chenghan Huang, Soroush Vosoughi

However, the work on network embedding in hyperbolic space has been focused on microscopic node embedding.

Network Embedding

Towards Improved Model Design for Authorship Identification: A Survey on Writing Style Understanding

no code implementations 30 Sep 2020 Weicheng Ma, Ruibo Liu, Li-Li Wang, Soroush Vosoughi

While other tasks based on linguistic style understanding benefit from deep learning methods, these methods have not behaved as well as traditional machine learning methods in many authorship-based tasks.

BIG-bench Machine Learning Natural Language Understanding

Emoji Prediction: Extensions and Benchmarking

1 code implementation 14 Jul 2020 Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi

In this paper, we extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification on the task.

Benchmarking Multi-Label Classification

Query-Free Adversarial Transfer via Undertrained Surrogates

no code implementations 1 Jul 2020 Chris Miller, Soroush Vosoughi

Deep neural networks are vulnerable to adversarial examples -- minor perturbations added to a model's input which cause the model to output an incorrect prediction.

Adversarial Attack

Salienteye: Maximizing Engagement While Maintaining Artistic Style on Instagram Using Deep Neural Networks

no code implementations 13 Jun 2020 Lili Wang, Ruibo Liu, Soroush Vosoughi

Once trained on their accounts, users can have new photos sorted based on predicted engagement and style similarity to their previous work, thus enabling them to upload photos that not only have the potential to maximize engagement from their followers but also maintain their style of photography.

Object Recognition Transfer Learning

What Are People Asking About COVID-19? A Question Classification Dataset

2 code implementations ACL 2020 Jerry Wei, Chengyu Huang, Soroush Vosoughi, Jason Wei

We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters.

Clustering General Classification

Enhanced Twitter Sentiment Classification Using Contextual Information

no code implementations WS 2015 Soroush Vosoughi, Helen Zhou, Deb Roy

This combined classifier outperforms the purely linguistic classifier, showing that integrating the rich contextual information available on Twitter into sentiment classification is a promising direction of research.

Classification General Classification +2

Tweet Acts: A Speech Act Classifier for Twitter

no code implementations 17 May 2016 Soroush Vosoughi, Deb Roy

We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features.

General Classification Multi-class Classification +1

Automatic Detection and Categorization of Election-Related Tweets

no code implementations 17 May 2016 Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy

With the rise in popularity of public social media and micro-blogging services, most notably Twitter, people have found a venue to hear and be heard by their peers without an intermediary.

A Semi-automatic Method for Efficient Detection of Stories on Social Media

no code implementations 17 May 2016 Soroush Vosoughi, Deb Roy

In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter.

Digital Stylometry: Linking Profiles Across Social Networks

no code implementations 17 May 2016 Soroush Vosoughi, Helen Zhou, Deb Roy

There is an ever growing number of users with accounts on multiple social media and networking sites.
