no code implementations • NAACL (ACL) 2022 • Vivek Kulkarni, Kenny Leung, Aria Haghighi
In contrast to most prior work, which focuses only on post classification into a small number of topics (10-20), we consider the task of large-scale topic classification in the context of Twitter, where the topic space is 10 times larger and each Tweet may have multiple topic associations.
no code implementations • 14 Oct 2022 • Shubhanshu Mishra, Aman Saini, Raheleh Makki, Sneha Mehta, Aria Haghighi, Ali Mollahosseini
Named Entity Recognition and Disambiguation (NERD) systems are foundational for information retrieval, question answering, event detection, and other natural language processing (NLP) applications.
no code implementations • 12 May 2022 • Ahmed El-Kishky, Thomas Markovich, Kenny Leung, Frank Portman, Aria Haghighi, Ying Xiao
To this end, we introduce kNN-Embed, a general approach to improving diversity in dense approximate nearest-neighbor (ANN) based retrieval.
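The general idea behind diversity-aware dense retrieval of this kind can be sketched as follows: represent a user as a mixture over item clusters inferred from their interaction history, then allocate the k retrieval slots across clusters in proportion to the mixture weights. This is a minimal illustrative sketch, not the paper's exact algorithm; `diversified_retrieve` and its parameters are hypothetical names, and exact dot-product scoring stands in for a real ANN index.

```python
import numpy as np

def diversified_retrieve(query_history, item_vecs, item_cluster, k, n_clusters):
    """Sketch of mixture-based diversified retrieval (hypothetical helper,
    not the kNN-Embed implementation).

    The user is modeled as a mixture over item clusters, with weights
    taken from their interaction history; the k retrieval slots are then
    split across clusters in proportion to those weights.
    """
    # Mixture weights: fraction of history items falling in each cluster
    # (tiny additive smoothing avoids all-zero weights).
    counts = np.bincount(item_cluster[query_history], minlength=n_clusters) + 1e-9
    weights = counts / counts.sum()
    # Fallback query vector: mean of the whole history.
    overall = item_vecs[query_history].mean(axis=0)
    results = []
    for c in range(n_clusters):
        # Proportional slot allocation; rounding may slightly under-fill k.
        slots = int(round(float(weights[c] * k)))
        if slots == 0:
            continue
        members = np.where(item_cluster == c)[0]
        in_hist = [i for i in query_history if item_cluster[i] == c]
        # Per-cluster query vector: mean of history items in that cluster.
        qvec = item_vecs[in_hist].mean(axis=0) if in_hist else overall
        # Exact scoring stands in for an ANN search within the cluster.
        scores = item_vecs[members] @ qvec
        top = members[np.argsort(-scores)[:slots]]
        results.extend(top.tolist())
    return results[:k]
```

Because slots are allocated per cluster rather than globally, the result set draws from every cluster the user has engaged with instead of collapsing onto the single dominant interest.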
no code implementations • 3 May 2022 • Vivek Kulkarni, Kenny Leung, Aria Haghighi
1 code implementation • 27 Jan 2022 • John Pougué-Biyong, Akshay Gupta, Aria Haghighi, Ahmed El-Kishky
We propose the Stance Embeddings Model (SEM), which jointly learns embeddings for each user and topic in signed social graphs with distinct edge types for each topic.
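A toy version of joint user/topic embedding on a signed graph can be sketched as below. This is only in the spirit of SEM, not the paper's objective: it models the sign of a topic-typed edge (u, v, topic, sign) as P(sign = +1) = sigmoid((U[u] + T[topic]) . U[v]) and fits both embedding tables by SGD on logistic loss. The function name and parameterization are assumptions for illustration.

```python
import numpy as np

def train_signed_topic_embeddings(edges, n_users, n_topics,
                                  dim=8, lr=0.1, epochs=200, seed=0):
    """Hypothetical sketch: jointly fit user embeddings U and topic
    embeddings T so that (U[u] + T[t]) . U[v] is high for positive
    (agreeing) edges and low for negative (disagreeing) edges."""
    rng = np.random.default_rng(seed)
    U = rng.normal(scale=0.1, size=(n_users, dim))
    T = rng.normal(scale=0.1, size=(n_topics, dim))
    for _ in range(epochs):
        for u, v, t, s in edges:
            q = U[u] + T[t]                       # topic-conditioned query
            p = 1.0 / (1.0 + np.exp(-(q @ U[v]))) # P(sign = +1)
            y = 1.0 if s > 0 else 0.0
            g = p - y                             # d(logistic loss)/d(score)
            grad_q = g * U[v]
            grad_v = g * q
            U[u] -= lr * grad_q                   # q = U[u] + T[t], so both
            T[t] -= lr * grad_q                   # receive the same gradient
            U[v] -= lr * grad_v
    return U, T
```

On a toy graph where users 0 and 1 agree on a topic while users 0 and 2 disagree, the learned score for the agreeing pair ends up above the score for the disagreeing pair.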
1 code implementation • Findings (EMNLP) 2021 • Vivek Kulkarni, Shubhanshu Mishra, Aria Haghighi
Although language depends heavily on the geographical, temporal, and other social contexts of the speaker, these elements have not been incorporated into modern transformer-based language models.
1 code implementation • WNUT (ACL) 2021 • Shubhanshu Mishra, Aria Haghighi
We evaluate a simple approach to improving zero-shot multilingual transfer of mBERT on a social media corpus by adding a pretraining task called translation pair prediction (TPP), which predicts whether a pair of cross-lingual texts is a valid translation.
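The data side of a TPP-style pretraining task can be sketched as follows: each aligned (source, translation) pair becomes a positive example, and negatives are built by pairing a source text with a randomly sampled non-matching translation. A pair classifier (e.g. mBERT with a pair-classification head) would then be trained on the resulting labels. The function name and sampling scheme here are illustrative assumptions, not the paper's exact construction.

```python
import random

def make_tpp_examples(parallel_pairs, neg_per_pos=1, seed=0):
    """Hypothetical sketch of TPP example construction: emit
    (text_a, text_b, label) triples where label 1 marks an aligned
    translation pair and label 0 a mismatched negative."""
    rng = random.Random(seed)
    examples = []
    targets = [t for _, t in parallel_pairs]
    for src, tgt in parallel_pairs:
        examples.append((src, tgt, 1))  # aligned pair -> positive
        for _ in range(neg_per_pos):
            # Sample a mismatched translation as a negative example.
            neg = rng.choice(targets)
            while neg == tgt and len(set(targets)) > 1:
                neg = rng.choice(targets)
            examples.append((src, neg, 0))
    return examples
```

Because negatives reuse real translations from the same corpus, the classifier must learn cross-lingual semantic alignment rather than surface cues to tell matched from mismatched pairs.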