no code implementations • NAACL 2022 • Prashanth Vijayaraghavan, Soroush Vosoughi
Our model relies on multi-view representations of the input tweet data to (a) extract different aspects of the input text including the context, entities, their relationships, and external knowledge; (b) model their mutual interplay; and (c) effectively speed up the learning process by requiring fewer training examples.
no code implementations • EMNLP 2020 • Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
The lack of multi-label and aspect-level emoji prediction datasets is one of the bottlenecks for this task.
Tasks: Multi-class Classification, Natural Language Understanding, +1
no code implementations • Findings (NAACL) 2022 • Ruibo Liu, Ge Zhang, Xinyu Feng, Soroush Vosoughi
Although current large-scale generative language models (LMs) can show impressive insights about factual knowledge, they do not exhibit similar success with respect to human value judgements (e.g., whether or not the generations of an LM are moral).
no code implementations • SemEval (NAACL) 2022 • Rishik Lad, Weicheng Ma, Soroush Vosoughi
This paper introduces the result of Team Dartmouth’s experiments on each of the five subtasks for the detection of sarcasm in English and Arabic tweets.
no code implementations • SemEval (NAACL) 2022 • Joseph Hajjar, Weicheng Ma, Soroush Vosoughi
This paper presents our approach for tackling SemEval-2022 Task 8: Multilingual News Article Similarity.
1 code implementation • 19 Feb 2025 • Chunhui Zhang, Yiren Jian, Zhongyu Ouyang, Soroush Vosoughi
Developing video captioning models is computationally expensive.
no code implementations • 13 Feb 2025 • Weicheng Ma, Hefan Zhang, Ivory Yang, Shiyu Ji, Joice Chen, Farnoosh Hashemi, Shubham Mohole, Ethan Gearey, Michael Macy, Saeed Hassanpour, Soroush Vosoughi
Large Language Models (LLMs) have shown proficiency in generating persuasive dialogue, yet concerns about the fluency and sophistication of their outputs persist.
1 code implementation • 9 Feb 2025 • Xingjian Diao, Chunhui Zhang, Weiyi Wu, Zhongyu Ouyang, Peijun Qing, Ming Cheng, Soroush Vosoughi, Jiang Gui
To overcome these challenges, we introduce a specialized cognitive module, temporal working memory (TWM), which aims to enhance the temporal modeling capabilities of MFMs.
1 code implementation • 27 Jan 2025 • Ivory Yang, Weicheng Ma, Chunhui Zhang, Soroush Vosoughi
Endangered languages, such as Navajo (the most widely spoken Native American language), are significantly underrepresented in contemporary language technologies, exacerbating the challenges of their preservation and revitalization.
1 code implementation • 29 Nov 2024 • Ivory Yang, Weicheng Ma, Soroush Vosoughi
The preservation and revitalization of endangered and extinct languages is a meaningful endeavor, conserving cultural heritage while enriching fields like linguistics and anthropology.
1 code implementation • 7 Nov 2024 • Yuxin Wang, Xiaomeng Zhu, Weimin Lyu, Saeed Hassanpour, Soroush Vosoughi
Handling implicit language is essential for natural language processing systems to achieve precise text understanding and facilitate natural interactions with users.
1 code implementation • 3 Nov 2024 • Alan Sun, Chiyu Ma, Kenneth Ge, Soroush Vosoughi
We present knowledge continuity, a novel definition inspired by Lipschitz continuity which aims to certify the robustness of neural networks across input domains (such as continuous and discrete domains in vision and language, respectively).
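The formal definition is not reproduced in this excerpt, but the Lipschitz-flavored intuition, bounding how much a model's behavior can change relative to distance in representation space, can be sketched empirically as below; the function name and the use of per-example loss differences are illustrative assumptions, not the paper's exact formulation.

import torch

def empirical_continuity_constant(losses, hidden, eps=1e-8):
    """losses: (B,) per-example losses; hidden: (B, d) hidden representations.
    Returns max |loss_i - loss_j| / ||h_i - h_j|| over all pairs in the batch."""
    loss_diff = (losses.unsqueeze(0) - losses.unsqueeze(1)).abs()  # (B, B)
    rep_dist = torch.cdist(hidden, hidden) + eps                   # (B, B)
    ratios = loss_diff / rep_dist
    ratios.fill_diagonal_(0.0)                                     # ignore self-pairs
    return ratios.max().item()        # smaller values indicate smoother behavior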
no code implementations • 1 Nov 2024 • Weicheng Ma, Luyang Zhao, Chun-Yi She, Yitao Jiang, Alan Sun, Bo Zhu, Devin Balkcom, Soroush Vosoughi
Recent large language models (LLMs) have demonstrated promising capabilities in modeling real-world knowledge and enhancing knowledge-based generation tasks.
1 code implementation • 28 Oct 2024 • Chiyu Ma, Jon Donnelly, Wenjun Liu, Soroush Vosoughi, Cynthia Rudin, Chaofan Chen
We present ProtoViT, a method for interpretable image classification combining deep learning and case-based reasoning.
1 code implementation • 19 Oct 2024 • Xiaobo Guo, Neil Potnis, Melody Yu, Nabeel Gillani, Soroush Vosoughi
This is also true in online discussion spaces like social media platforms.
1 code implementation • 14 Oct 2024 • Peijun Qing, Chongyang Gao, Yefan Zhou, Xingjian Diao, Yaoqing Yang, Soroush Vosoughi
However, inspired by the observed redundancy in traditional MoE structures, previous studies identify similar redundancy among LoRA experts within the MoE architecture, highlighting the necessity for non-uniform allocation of LoRA experts across different layers.
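One way to realize a non-uniform allocation is simply to instantiate a different number of LoRA experts at each layer; the module below is a generic softmax-routed sketch under that assumption, not the paper's architecture, and the per-layer expert counts are hypothetical.

import torch
import torch.nn as nn

class MoELoRALinear(nn.Module):
    """A base linear layer augmented with a configurable number of LoRA experts
    and a softmax router; the expert count can differ per layer."""
    def __init__(self, base: nn.Linear, num_experts: int, rank: int = 8):
        super().__init__()
        self.base = base
        self.router = nn.Linear(base.in_features, num_experts)
        self.A = nn.Parameter(torch.randn(num_experts, base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(num_experts, rank, base.out_features))

    def forward(self, x):
        gates = torch.softmax(self.router(x), dim=-1)                     # (..., E)
        expert_out = torch.einsum("...d,edr,ero->...eo", x, self.A, self.B)
        return self.base(x) + (gates.unsqueeze(-1) * expert_out).sum(dim=-2)

# Non-uniform allocation: e.g., fewer experts in early layers, more in later ones.
experts_per_layer = [2, 2, 4, 4, 8, 8]
layers = [MoELoRALinear(nn.Linear(768, 768), n) for n in experts_per_layer]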
no code implementations • 14 Aug 2024 • Ivory Yang, Xiaobo Guo, Sean Xie, Soroush Vosoughi
This study presents a comprehensive, long-term project to explore the effectiveness of various prompting techniques in detecting dialogical mental manipulation.
no code implementations • 1 Jul 2024 • Maxwell Aladago, Lorenzo Torresani, Soroush Vosoughi
In the field of vision-language contrastive learning, models such as CLIP capitalize on matched image-caption pairs as positive examples and leverage within-batch non-matching pairs as negatives.
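For reference, a minimal sketch of the standard symmetric in-batch contrastive (CLIP-style) objective this setup describes; this is the generic formulation, not whatever modification the paper proposes.

import torch
import torch.nn.functional as F

def clip_style_loss(image_emb, text_emb, temperature=0.07):
    """Symmetric in-batch contrastive loss: matched (i, i) pairs are positives,
    every other pair in the batch serves as a negative."""
    image_emb = F.normalize(image_emb, dim=-1)
    text_emb = F.normalize(text_emb, dim=-1)
    logits = image_emb @ text_emb.t() / temperature          # (B, B) similarity matrix
    targets = torch.arange(logits.size(0), device=logits.device)
    loss_i2t = F.cross_entropy(logits, targets)              # image -> text direction
    loss_t2i = F.cross_entropy(logits.t(), targets)          # text -> image direction
    return (loss_i2t + loss_t2i) / 2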
no code implementations • 23 Jun 2024 • Xiaobo Guo, Soroush Vosoughi
Large Language Models (LLMs) have shown remarkable capabilities in zero-shot learning applications, generating responses to queries using only pre-training information without the need for additional fine-tuning.
1 code implementation • 12 Jun 2024 • Lin Shi, Chiyu Ma, Wenhua Liang, Weicheng Ma, Soroush Vosoughi
LLM-as-a-Judge presents a promising alternative to human evaluators across various tasks, but inherent biases, especially position bias (a tendency to favor solutions based on their position in the prompt), have compromised its effectiveness.
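A simple way to surface position bias is to query the judge twice with the two candidates swapped and measure how often the verdict flips; the sketch below assumes a hypothetical `judge` callable returning "A" or "B" and is not the paper's evaluation protocol.

def consistency_under_swap(judge, question, answer_a, answer_b):
    """Query the judge twice with the candidate order swapped; a position-consistent
    judge should prefer the same underlying answer both times."""
    first = judge(question, answer_a, answer_b)     # returns "A" or "B"
    second = judge(question, answer_b, answer_a)    # same answers, positions swapped
    second_mapped = {"A": "B", "B": "A"}[second]    # map verdict back to original labels
    return first == second_mapped

def position_bias_rate(judge, eval_set):
    """Fraction of items whose verdict depends on presentation order."""
    flips = sum(not consistency_under_swap(judge, q, a, b) for q, a, b in eval_set)
    return flips / len(eval_set)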
no code implementations • 5 Jun 2024 • Xiaobo Guo, Soroush Vosoughi
The rapid proliferation of online content necessitates effective summarization methods, among which dynamic aspect-based summarization stands out.
1 code implementation • 26 May 2024 • Yuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi
We anticipate that MentalManip will stimulate further research, leading to progress in both understanding and mitigating the impact of mental manipulation in conversations.
1 code implementation • 16 Feb 2024 • Xiaobo Guo, Soroush Vosoughi
Aspect-based summarization has seen significant advancements, especially in structured text.
1 code implementation • 3 Nov 2023 • Sean Xie, Soroush Vosoughi, Saeed Hassanpour
Large Language Models (LLMs) have significantly advanced the field of Natural Language Processing (NLP), but their lack of interpretability has been a major concern.
1 code implementation • ICCV 2023 • Weiyi Wu, Chongyang Gao, Joseph DiPalma, Soroush Vosoughi, Saeed Hassanpour
This framework aims for transferable representation learning and semantically meaningful clustering by synergizing invariance loss and clustering loss in WSI analysis.
1 code implementation • 5 Oct 2023 • Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang
Our experimental findings demonstrate that our approach accelerates the training of vision-language models by a factor of 5 without a noticeable impact on overall performance.
1 code implementation • NeurIPS 2023 • Yiren Jian, Chongyang Gao, Soroush Vosoughi
We present a novel methodology aimed at optimizing the application of frozen large language models (LLMs) for resource-intensive vision-language (VL) pre-training.
no code implementations • 1 Jun 2023 • Lili Wang, Chenghan Huang, Weicheng Ma, Xinyuan Cao, Soroush Vosoughi
We evaluate our proposed model on five publicly available datasets for the task of temporal graph similarity ranking, and our model outperforms baseline methods.
no code implementations • 1 Jun 2023 • Lili Wang, Chenghan Huang, Chongyang Gao, Weicheng Ma, Soroush Vosoughi
In the pursuit of accurate and scalable quantitative methods for financial market analysis, the focus has shifted from individual stock models to those capturing interrelations between companies and their stocks.
1 code implementation • 26 May 2023 • Ruibo Liu, Ruixin Yang, Chenyan Jia, Ge Zhang, Denny Zhou, Andrew M. Dai, Diyi Yang, Soroush Vosoughi
Social alignment in AI systems aims to ensure that these models behave according to established societal values.
1 code implementation • 13 Feb 2023 • Yiren Jian, Chongyang Gao, Chen Zeng, Yunjie Zhao, Soroush Vosoughi
Our findings indicate that the learned structural patterns of proteins can be transferred to RNAs, opening up potential new avenues for research.
no code implementations • 7 Feb 2023 • Xiaobo Guo, Weicheng Ma, Soroush Vosoughi
Differential framing of issues can lead to divergent world views on important issues.
no code implementations • 1 Jan 2023 • Ruibo Liu, Chenyan Jia, Ge Zhang, Ziyu Zhuang, Tony X Liu, Soroush Vosoughi
We present Second Thought, a new learning paradigm that enables language models (LMs) to re-align with human values.
no code implementations • 11 Oct 2022 • Ruibo Liu, Jason Wei, Shixiang Shane Gu, Te-Yen Wu, Soroush Vosoughi, Claire Cui, Denny Zhou, Andrew M. Dai
By training solely on written text, current language models (LMs) miss the grounded experience of humans in the real world; their failure to relate language to the physical world causes knowledge to be misrepresented and leads to obvious mistakes in their reasoning.
4 code implementations • 6 Oct 2022 • Freda Shi, Mirac Suzgun, Markus Freitag, Xuezhi Wang, Suraj Srivats, Soroush Vosoughi, Hyung Won Chung, Yi Tay, Sebastian Ruder, Denny Zhou, Dipanjan Das, Jason Wei
Finally, we show that the multilingual reasoning abilities of language models extend to other tasks such as commonsense reasoning and word-in-context semantic judgment.
1 code implementation • 20 Sep 2022 • Yiren Jian, Chongyang Gao, Soroush Vosoughi
This indicates that Transformer models are able to generalize better by doing a similar task (i.e., clustering) with unpaired examples from different modalities in a multi-task fashion.
no code implementations • 13 Sep 2022 • Daniel DiPietro, Vivek Hazari, Soroush Vosoughi
Suicide is a major public health crisis.
no code implementations • 24 May 2022 • Yuansheng Xie, Soroush Vosoughi, Saeed Hassanpour
Machine learning (ML) models have been applied to a wide range of natural language processing (NLP) tasks in recent years.
1 code implementation • NAACL 2022 • Yiren Jian, Chongyang Gao, Soroush Vosoughi
Few-shot language learners adapt knowledge from a pre-trained model to recognize novel classes from a few labeled sentences.
1 code implementation • NAACL 2022 • Yiren Jian, Chongyang Gao, Soroush Vosoughi
Following this line of work, we present a contrastive learning framework that clusters inputs from the same class for better generality of models trained with only limited examples.
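As an illustration of the general idea rather than the paper's exact objective, a supervised contrastive loss that pulls same-class examples together within a batch can be written as follows.

import torch
import torch.nn.functional as F

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """For each anchor, same-label examples in the batch are positives and the
    remaining examples are negatives (SupCon-style clustering by class)."""
    features = F.normalize(features, dim=-1)
    sim = features @ features.t() / temperature                        # (B, B)
    mask_self = torch.eye(len(labels), device=features.device)
    mask_pos = (labels.unsqueeze(0) == labels.unsqueeze(1)).float() - mask_self
    logits = sim - 1e9 * mask_self                                     # exclude self-pairs
    log_prob = logits - torch.logsumexp(logits, dim=1, keepdim=True)
    pos_count = mask_pos.sum(1).clamp(min=1)
    return -((mask_pos * log_prob).sum(1) / pos_count).mean()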
1 code implementation • ICLR 2022 • Ruibo Liu, Chongyang Gao, Chenyan Jia, Guangxuan Xu, Soroush Vosoughi
The performance of existing text style transfer models is severely limited by the non-parallel datasets on which the models are trained.
1 code implementation • ICLR 2022 • Ruibo Liu, Guoqing Zheng, Shashank Gupta, Radhika Gaonkar, Chongyang Gao, Soroush Vosoughi, Milad Shokouhi, Ahmed Hassan Awadallah
Hence, they tend to suffer from counterfactual or hallucinatory generation when used in knowledge-intensive natural language generation (NLG) tasks.
Ranked #2 on Question Answering on KILT: ELI5
no code implementations • 30 Mar 2022 • Sean Xie, Soroush Vosoughi, Saeed Hassanpour
Artificial intelligence, particularly through recent advancements in deep learning, has achieved exceptional performances in many tasks in fields such as natural language processing and computer vision.
no code implementations • Findings (ACL) 2022 • Weicheng Ma, Samiha Datta, Lili Wang, Soroush Vosoughi
While cultural backgrounds have been shown to affect linguistic expressions, existing natural language processing (NLP) research on culture modeling is overly coarse-grained and does not examine cultural differences among speakers of the same language.
Tasks: Cultural Vocal Bursts Intensity Prediction, Language Modeling, +6
no code implementations • 24 Jan 2022 • Xiaobo Guo, Yaojia Sun, Soroush Vosoughi
Our proposed model differs from other work in this area in that it is based entirely on the emotional states of users on Reddit and the transitions between these states, whereas prior work is typically based on content-based representations (e.g., n-grams, language model embeddings, etc.).
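A user representation built purely from emotional states and their transitions, rather than from content, could look like the following sketch; the emotion label set and normalization are hypothetical choices, not the paper's feature set.

from collections import Counter
from itertools import pairwise  # Python 3.10+

EMOTIONS = ["joy", "sadness", "anger", "fear", "neutral"]  # hypothetical label set

def transition_features(post_emotions):
    """Turn a user's chronological sequence of per-post emotion labels into a
    normalized vector of state-transition frequencies (no content features)."""
    counts = Counter(pairwise(post_emotions))
    total = max(sum(counts.values()), 1)
    return [counts[(a, b)] / total for a in EMOTIONS for b in EMOTIONS]

# Example: a user whose posts drift from joy toward sadness
features = transition_features(["joy", "joy", "neutral", "sadness", "sadness"])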
no code implementations • 14 Sep 2021 • Lili Wang, Chenghan Huang, Weicheng Ma, Ying Lu, Soroush Vosoughi
In this paper, we present a novel and flexible framework using stress majorization, to transform the high-dimensional role identities in networks directly (without approximation or indirect modeling) to a low-dimensional embedding space.
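The stress objective being minimized, a weighted sum over pairs of (||x_i - x_j|| - d_ij)^2, can be illustrated with the small sketch below; for brevity it uses plain gradient descent and a common 1/d^2 weighting, both assumptions, rather than the majorization updates the paper employs.

import numpy as np

def stress(X, D, W):
    """Stress of an embedding X against target (role-dissimilarity) distances D."""
    diff = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1) - D
    return np.sum(W * diff ** 2) / 2

def embed_by_stress(D, dim=2, lr=0.01, steps=2000, seed=0):
    """Find X minimizing sum_ij w_ij * (||x_i - x_j|| - d_ij)^2 by gradient descent."""
    rng = np.random.default_rng(seed)
    n = D.shape[0]
    W = 1.0 / np.maximum(D, 1e-8) ** 2        # common 1/d^2 weighting (an assumption)
    np.fill_diagonal(W, 0.0)
    X = rng.normal(size=(n, dim))
    for _ in range(steps):
        delta = X[:, None, :] - X[None, :, :]                  # (n, n, dim)
        dist = np.linalg.norm(delta, axis=-1) + 1e-8
        coeff = W * (dist - D) / dist                          # (n, n)
        grad = (coeff[:, :, None] * delta).sum(axis=1)         # (n, dim)
        X -= lr * grad
    return X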
no code implementations • 14 Sep 2021 • Lili Wang, Chenghan Huang, Weicheng Ma, Xinyuan Cao, Soroush Vosoughi
Recent years have seen a rise in the development of representational learning methods for graph data.
no code implementations • EMNLP 2021 • Weicheng Ma, Renze Lou, Kai Zhang, Lili Wang, Soroush Vosoughi
Compared to AUTOSEM, a strong baseline method, GradTS improves the performance of MT-DNN with a bert-base-cased backend model by 0.33% to 17.93% on 8 natural language understanding (NLU) tasks in the GLUE benchmark.
no code implementations • ACL 2021 • Ruibo Liu, Jason Wei, Soroush Vosoughi
Although automated metrics are commonly used to evaluate NLG systems, they often correlate poorly with human judgements.
no code implementations • ACL 2021 • Weicheng Ma, Kai Zhang, Renze Lou, Lili Wang, Soroush Vosoughi
Through extensive experiments, we show that (1) pruning a number of attention heads in a multi-lingual Transformer-based model has, in general, positive effects on its performance in cross-lingual and multi-lingual tasks and (2) the attention heads to be pruned can be ranked using gradients and identified with a few trial experiments.
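A rough sketch of gradient-based head ranking: accumulate the absolute gradient flowing into a per-head mask and treat low-scoring heads as pruning candidates. The `head_mask` keyword follows Hugging Face's BERT-style interface; the rest of the setup is an illustrative assumption, not the paper's exact procedure.

import torch

def rank_heads_by_gradient(model, dataloader, loss_fn, num_layers, num_heads, device="cpu"):
    """Accumulate |gradient| on a per-head mask over a few batches and rank heads by
    that score; low-scoring heads become pruning candidates. Assumes the model accepts
    a `head_mask` of shape (layers, heads), as Hugging Face BERT-style models do."""
    head_mask = torch.ones(num_layers, num_heads, device=device, requires_grad=True)
    scores = torch.zeros(num_layers, num_heads, device=device)
    for inputs, labels in dataloader:
        outputs = model(**inputs, head_mask=head_mask)
        loss = loss_fn(outputs.logits, labels)
        scores += torch.autograd.grad(loss, head_mask)[0].abs()
    ranking = torch.argsort(scores.flatten(), descending=True)  # most important heads first
    return scores, ranking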
no code implementations • Findings (ACL) 2021 • Ruibo Liu, Jason Wei, Chenyan Jia, Soroush Vosoughi
Generating context-aware language that embodies diverse emotions is an important step towards building empathetic NLP systems.
no code implementations • 18 Jun 2021 • Lili Wang, Chongyang Gao, Chenghan Huang, Ruibo Liu, Weicheng Ma, Soroush Vosoughi
A common type of network is the heterogeneous network, where the nodes (and edges) can be of different types.
no code implementations • NAACL 2021 • Jason Wei, Kelly Finn, Emma Templeton, Thalia Wheatley, Soroush Vosoughi
The recent advent of online text-based therapy presents a new opportunity to analyze the complexity loss paradox in a novel operationalization: linguistic complexity loss in text-based therapy conversations.
1 code implementation • Findings (ACL) 2021 • Steven Y. Feng, Varun Gangal, Jason Wei, Sarath Chandar, Soroush Vosoughi, Teruko Mitamura, Eduard Hovy
In this paper, we present a comprehensive and unifying survey of data augmentation for NLP by summarizing the literature in a structured manner.
no code implementations • 30 Apr 2021 • Ruibo Liu, Chenyan Jia, Jason Wei, Guangxuan Xu, Lili Wang, Soroush Vosoughi
Current large-scale language models can be politically biased as a result of the data they are trained on, potentially causing serious problems when they are deployed in real-world settings.
1 code implementation • SEMEVAL 2021 • Aadil Islam, Weicheng Ma, Soroush Vosoughi
This paper describes a system submitted by team BigGreen to LCP 2021 for predicting the lexical complexity of English words in a given context.
1 code implementation • SEMEVAL 2021 • Yakoob Khan, Weicheng Ma, Soroush Vosoughi
This paper describes our approach to the Toxic Spans Detection problem (SemEval-2021 Task 5).
1 code implementation • NAACL 2021 • Jason Wei, Chengyu Huang, Soroush Vosoughi, Yu Cheng, Shiqi Xu
Few-shot text classification is a fundamental NLP task in which a model aims to classify text into a large number of categories, given only a few training examples per category.
1 code implementation • 11 Feb 2021 • Kang Gu, Soroush Vosoughi, Temiloluwa Prioleau
In recent years, there has been an ever-increasing amount of multivariate time series (MTS) data in various domains, typically generated by a large family of sensors such as wearable devices.
no code implementations • EACL 2021 • Jason Wei, Chengyu Huang, Shiqi Xu, Soroush Vosoughi
Traditional data augmentation aims to increase the coverage of the input distribution by generating augmented examples that strongly resemble original samples in an online fashion where augmented examples dominate training.
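A token-level sketch of this kind of traditional augmentation, producing copies that stay close to the original example (random deletion and position swaps only; purely illustrative and not the paper's method):

import random

def augment(tokens, p_delete=0.1, n_swaps=1, rng=None):
    """Produce a lightly perturbed copy of a token sequence that stays close to the
    original: random token deletion plus a few position swaps."""
    rng = rng or random.Random(0)
    out = [t for t in tokens if rng.random() > p_delete] or list(tokens)
    for _ in range(n_swaps):
        if len(out) > 1:
            i, j = rng.sample(range(len(out)), 2)
            out[i], out[j] = out[j], out[i]
    return out

# Example: several augmented copies of one training sentence
sentence = "the movie was surprisingly good".split()
augmented = [augment(sentence, rng=random.Random(seed)) for seed in range(4)]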
no code implementations • 5 Jan 2021 • Ruibo Liu, Lili Wang, Chenyan Jia, Soroush Vosoughi
To detect polar words, we train a multi-attribute-aware word embedding model that is aware of ideology and topics on 360k full-length media articles.
no code implementations • 26 Dec 2020 • Neeti Pokhriyal, Abenezer Dara, Benjamin Valentino, Soroush Vosoughi
By using decadal data (2008-2019) from Reddit, we show that both monthly and daily estimates of CCI can, indeed, be reliably estimated at least several months in advance, and that our model estimates are far superior to those generated by the existing methods.
no code implementations • 24 Dec 2020 • Xiaobo Guo, Soroush Vosoughi
The prevalence of state-sponsored propaganda on the Internet has become a cause for concern in recent years.
no code implementations • EMNLP (WNUT) 2020 • Lili Wang, Chongyang Gao, Jason Wei, Weicheng Ma, Ruibo Liu, Soroush Vosoughi
The field of NLP has seen unprecedented achievements in recent years.
no code implementations • ACL (unimplicit) 2021 • Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
Finally, we clean up the improper or outdated annotations in one of the MD benchmark datasets and re-benchmark it with our Transformer-based model.
no code implementations • EMNLP (WNUT) 2020 • Dylan Whang, Soroush Vosoughi
We compared its performance to a suite of machine learning models.
no code implementations • EMNLP (WNUT) 2020 • Chris Miller, Soroush Vosoughi
Relation and event extraction is an important task in natural language processing.
no code implementations • EMNLP 2020 • Ruibo Liu, Guangxuan Xu, Chenyan Jia, Weicheng Ma, Lili Wang, Soroush Vosoughi
For instance, Data Boost improves F1 for the three tasks by 8.7% on average when given only 10% of the whole data for training.
no code implementations • 5 Dec 2020 • Ruibo Liu, Guangxuan Xu, Soroush Vosoughi
In this work, we present Dager (Data Augmenter), a generation-based data augmentation method, that improves the performance of classification on imbalanced and low-resource data such as the offensive language dataset.
no code implementations • 3 Nov 2020 • Lili Wang, Ying Lu, Chenghan Huang, Soroush Vosoughi
However, the work on network embedding in hyperbolic space has been focused on microscopic node embedding.
no code implementations • 30 Sep 2020 • Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
While other tasks based on linguistic style understanding benefit from deep learning methods, these methods have not performed as well as traditional machine learning methods in many authorship-based tasks.
Tasks: BIG-bench Machine Learning, Natural Language Understanding, +1
1 code implementation • 14 Jul 2020 • Weicheng Ma, Ruibo Liu, Lili Wang, Soroush Vosoughi
In this paper, we extend the existing setting of the emoji prediction task to include a richer set of emojis and to allow multi-label classification on the task.
no code implementations • 1 Jul 2020 • Chris Miller, Soroush Vosoughi
Deep neural networks are vulnerable to adversarial examples: minor perturbations added to a model's input that cause the model to output an incorrect prediction.
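The textbook construction of such a perturbation for a differentiable model is the fast gradient sign method, sketched below as a generic example (not necessarily the attack considered in this paper).

import torch

def fgsm_perturb(model, loss_fn, x, y, epsilon=0.01):
    """Add an epsilon-sized step in the sign of the loss gradient, a minor
    perturbation that often flips the model's prediction."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = loss_fn(model(x_adv), y)
    loss.backward()
    return (x_adv + epsilon * x_adv.grad.sign()).detach()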
no code implementations • 13 Jun 2020 • Lili Wang, Ruibo Liu, Soroush Vosoughi
Once trained on their accounts, users can have new photos sorted by predicted engagement and style similarity to their previous work. This enables them to upload photos that not only have the potential to maximize engagement from their followers but also maintain their style of photography.
2 code implementations • ACL 2020 • Jerry Wei, Chengyu Huang, Soroush Vosoughi, Jason Wei
We present COVID-Q, a set of 1,690 questions about COVID-19 from 13 sources, which we annotate into 15 question categories and 207 question clusters.
no code implementations • ACL 2017 • Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy
In this paper, we present a demographic classifier for gender, age, political orientation and location on Twitter.
no code implementations • 26 Jul 2016 • Soroush Vosoughi, Prashanth Vijayaraghavan, Deb Roy
The vector representations generated by our model are generic, and hence can be applied to a variety of tasks.
no code implementations • SEMEVAL 2016 • Prashanth Vijayaraghavan, Ivan Sysoev, Soroush Vosoughi, Deb Roy
This paper describes our approach for the Detecting Stance in Tweets task (SemEval-2016 Task 6).
no code implementations • WS 2015 • Soroush Vosoughi, Helen Zhou, Deb Roy
This combined classifier outperforms the purely linguistic classifier, showing that integrating the rich contextual information available on Twitter into sentiment classification is a promising direction of research.
no code implementations • 17 May 2016 • Soroush Vosoughi, Deb Roy
We created a taxonomy of six speech acts for Twitter and proposed a set of semantic and syntactic features.
no code implementations • 17 May 2016 • Prashanth Vijayaraghavan, Soroush Vosoughi, Deb Roy
With the rise in popularity of public social media and micro-blogging services, most notably Twitter, people have found a venue to hear and be heard by their peers without an intermediary.
no code implementations • 17 May 2016 • Soroush Vosoughi, Deb Roy
In this paper, we present a novel semi-automatic tool that enables users to efficiently identify and track stories about real-world events on Twitter.
no code implementations • 17 May 2016 • Soroush Vosoughi, Helen Zhou, Deb Roy
There is an ever growing number of users with accounts on multiple social media and networking sites.