no code implementations • Findings (EMNLP) 2021 • Kezhen Chen, Qiuyuan Huang, Daniel McDuff, Xiang Gao, Hamid Palangi, JianFeng Wang, Kenneth Forbus, Jianfeng Gao
Based on these annotations, we define two different tasks for the NICE dataset.
no code implementations • 7 Jul 2025 • Ashima Suvarna, Christina Chance, Hamid Palangi, Sophie Hao, Thomas Hartvigsen, Saadia Gabriel
Automatic toxic language detection is critical for creating safe, inclusive online spaces.
1 code implementation • 9 Jun 2025 • Ken Gu, Zhihan Zhang, Kate Lin, Yuwei Zhang, Akshay Paruchuri, Hong Yu, Mehran Kazemi, Kumar Ayush, A. Ali Heydari, Maxwell A. Xu, Girish Narayanswamy, Yun Liu, Ming-Zher Poh, Yuzhe Yang, Mark Malhotra, Shwetak Patel, Hamid Palangi, Xuhai Xu, Daniel McDuff, Tim Althoff, Xin Liu
Language models (LMs) are increasingly being deployed to perform autonomous data analyses.
no code implementations • 15 Apr 2025 • Salman Rahman, Liwei Jiang, James Shiffer, Genglin Liu, Sheriff Issaka, Md Rizwan Parvez, Hamid Palangi, Kai-Wei Chang, Yejin Choi, Saadia Gabriel
Multi-turn interactions with language models (LMs) pose critical safety risks, as harmful intent can be strategically spread across exchanges.
no code implementations • 2 Apr 2025 • Souradip Chakraborty, Mohammadreza Pourreza, Ruoxi Sun, Yiwen Song, Nino Scherrer, Jindong Gu, Furong Huang, Amrit Singh Bedi, Ahmad Beirami, Hamid Palangi, Tomas Pfister
Hence, we propose Iterative Agent Decoding (IAD) which combines iterative refinement with dynamic candidate evaluation and selection guided by a verifier.
no code implementations • 1 Apr 2025 • Wenjun Zeng, Dana Kurniawan, Ryan Mullins, Yuchi Liu, Tamoghna Saha, Dirichi Ike-Njoku, Jindong Gu, Yiwen Song, Cai Xu, Jingjing Zhou, Aparna Joshi, Shravan Dheep, Mani Malek, Hamid Palangi, Joon Baek, Rick Pereira, Karthik Narasimhan
We introduce ShieldGemma 2, a 4B parameter image content moderation model built on Gemma 3.
no code implementations • 11 Mar 2025 • Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister
Large Language Models (LLMs) have made significant progress in open-ended dialogue, yet their inability to retain and retrieve relevant information from long-term interactions limits their effectiveness in applications requiring sustained personalization.
no code implementations • 10 Mar 2025 • Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, Yanfei Chen, Jindong Gu, Long T. Le, Kai-Wei Chang, Chen-Yu Lee, Hamid Palangi, Tomas Pfister
To address this, we propose Magnet, a principled framework for synthesizing high-quality training trajectories to enhance the function calling capability of large language model agents in multi-turn conversations with humans.
1 code implementation • 25 Feb 2025 • Tushar Aggarwal, Kumar Tanmay, Ayush Agrawal, Kumar Ayush, Hamid Palangi, Paul Pu Liang
Multilingual language models (LMs) are expected to recall factual knowledge consistently across languages, yet they often fail to transfer knowledge between languages even when they possess the correct information in one of the languages.
no code implementations • 22 Feb 2025 • Mihir Parmar, Xin Liu, Palash Goyal, Yanfei Chen, Long Le, Swaroop Mishra, Hossein Mobahi, Jindong Gu, Zifeng Wang, Hootan Nakhost, Chitta Baral, Chen-Yu Lee, Tomas Pfister, Hamid Palangi
Recent agent frameworks and inference-time algorithms often struggle with complex planning problems due to limitations in verifying generated plans or reasoning and varying complexity of instances within a single task.
no code implementations • 6 Feb 2025 • Shangbin Feng, Zifeng Wang, Palash Goyal, Yike Wang, Weijia Shi, Huang Xia, Hamid Palangi, Luke Zettlemoyer, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister
We propose Heterogeneous Swarms, an algorithm to design multi-LLM systems by jointly optimizing model roles and weights.
no code implementations • 4 Feb 2025 • Han Zhou, Xingchen Wan, Ruoxi Sun, Hamid Palangi, Shariq Iqbal, Ivan Vulić, Anna Korhonen, Sercan Ö. Arik
Large language models, employed as multiple agents that interact and collaborate with each other, have excelled at solving complex tasks.
no code implementations • 29 Nov 2024 • Justin Chih-Yao Chen, Zifeng Wang, Hamid Palangi, Rujun Han, Sayna Ebrahimi, Long Le, Vincent Perot, Swaroop Mishra, Mohit Bansal, Chen-Yu Lee, Tomas Pfister
Reverse thinking plays a crucial role in human reasoning.
no code implementations • 15 Oct 2024 • Shangbin Feng, Zifeng Wang, Yike Wang, Sayna Ebrahimi, Hamid Palangi, Lesly Miculicich, Achin Kulshrestha, Nathalie Rauschmayr, Yejin Choi, Yulia Tsvetkov, Chen-Yu Lee, Tomas Pfister
Extensive experiments demonstrate that Model Swarms could flexibly adapt LLM experts to a single task, multi-task domains, reward models, as well as diverse human interests, improving over 12 model composition baselines by up to 21.0% across tasks and contexts.
no code implementations • 26 Sep 2024 • Elliot L. Epstein, Kaisheng Yao, Jing Li, Xinyi Bai, Hamid Palangi
When all the instructions are also appended to the end of the model input context, the $\operatorname{PIF}$ metric improves by 22.3 points on average, showing that the challenge with the task lies not only in following the instructions, but also in retrieving the instructions spread out in the model context.
1 code implementation • 13 Sep 2024 • Vidhisha Balachandran, Jingya Chen, Neel Joshi, Besmira Nushi, Hamid Palangi, Eduardo Salinas, Vibhav Vineet, James Woffinden-Luey, Safoora Yousefi
Second, we introduce Eureka-Bench as an extensible collection of benchmarks testing capabilities that (i) are still challenging for state-of-the-art models and (ii) represent fundamental but overlooked language and multimodal capabilities.
1 code implementation • 13 Feb 2024 • Kyle O'Brien, Nathan Ng, Isha Puri, Jorge Mendez, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi, Thomas Hartvigsen
Machine learning models for text classification often excel on in-distribution (ID) data but struggle with unseen out-of-distribution (OOD) inputs.
no code implementations • 9 Feb 2024 • Shima Imani, Hamid Palangi
Large Language Models (LLMs) have demonstrated impressive performance across a wide range of applications; however, assessing their reasoning capabilities remains a significant challenge.
1 code implementation • 4 Dec 2023 • Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West
We present a novel method to study grounding abilities using Fakepedia, a novel dataset of counterfactual texts constructed to clash with a model's internal parametric knowledge.
no code implementations • 18 Nov 2023 • Arindam Mitra, Luciano del Corro, Shweti Mahajan, Andres Codas, Clarisse Simoes, Sahaj Agarwal, Xuxi Chen, Anastasia Razdaibiedina, Erik Jones, Kriti Aggarwal, Hamid Palangi, Guoqing Zheng, Corby Rosset, Hamed Khanpour, Ahmed Awadallah
Research on training small LMs has often relied on imitation learning to replicate the output of more capable models.
Ranked #1 on Crass AI on BIG-bench
no code implementations • 26 Oct 2023 • Ahmed Magooda, Alec Helyar, Kyle Jackson, David Sullivan, Chad Atalla, Emily Sheng, Dan Vann, Richard Edgar, Hamid Palangi, Roman Lutz, Hongliang Kong, Vincent Yun, Eslam Kamal, Federico Zarfati, Hanna Wallach, Sarah Bird, Mei Chen
We present a framework for the automated measurement of responsible AI (RAI) metrics for large language models (LLMs) and associated products and services.
no code implementations • 11 Oct 2023 • Ranjita Naik, Varun Chandrasekaran, Mert Yuksekgonul, Hamid Palangi, Besmira Nushi
Large language models (LLMs) are documented to struggle in settings that require complex reasoning.
no code implementations • 10 Oct 2023 • Erik Jones, Hamid Palangi, Clarisse Simões, Varun Chandrasekaran, Subhabrata Mukherjee, Arindam Mitra, Ahmed Awadallah, Ece Kamar
We also find that optimizing the system message rather than the model weights can be critical; fine-tuning the entire model on the synthetic task can counterintuitively increase hallucination.
1 code implementation • 26 Sep 2023 • Mert Yuksekgonul, Varun Chandrasekaran, Erik Jones, Suriya Gunasekar, Ranjita Naik, Hamid Palangi, Ece Kamar, Besmira Nushi
We investigate the internal behavior of Transformer-based Large Language Models (LLMs) when they generate factually incorrect text.
no code implementations • 20 Jul 2023 • Somayeh Ghanbarzadeh, Yan Huang, Hamid Palangi, Radames Cruz Moreno, Hamed Khanpour
Recent studies have revealed that the widely-used Pre-trained Language Models (PLMs) propagate societal biases from the large unmoderated pre-training corpora.
no code implementations • 19 Jul 2023 • Somayeh Ghanbarzadeh, Hamid Palangi, Yan Huang, Radames Cruz Moreno, Hamed Khanpour
The reusability of state-of-the-art Pre-trained Language Models (PLMs) is often limited by their generalization problem, where their performance drastically decreases when evaluated on examples that differ from the training dataset, known as Out-of-Distribution (OOD)/unseen examples.
4 code implementations • 5 Jun 2023 • Subhabrata Mukherjee, Arindam Mitra, Ganesh Jawahar, Sahaj Agarwal, Hamid Palangi, Ahmed Awadallah
To address these challenges, we develop Orca (We are working with our legal team to publicly release a diff of the model weights in accordance with LLaMA's release policy to be published at https://aka.ms/orca-lm), a 13-billion parameter model that learns to imitate the reasoning process of LFMs.
no code implementations • 8 Apr 2023 • Yu Yang, Besmira Nushi, Hamid Palangi, Baharan Mirzasoleiman
Spurious correlations that degrade model generalization or lead the model to be right for the wrong reasons are one of the main robustness concerns for real-world deployments.
3 code implementations • 22 Mar 2023 • Sébastien Bubeck, Varun Chandrasekaran, Ronen Eldan, Johannes Gehrke, Eric Horvitz, Ece Kamar, Peter Lee, Yin Tat Lee, Yuanzhi Li, Scott Lundberg, Harsha Nori, Hamid Palangi, Marco Tulio Ribeiro, Yi Zhang
We contend that (this early version of) GPT-4 is part of a new cohort of LLMs (along with ChatGPT and Google's PaLM for example) that exhibit more general intelligence than previous AI models.
Ranked #36 on Arithmetic Reasoning on GSM8K
1 code implementation • 22 Jan 2023 • Saghar Hosseini, Hamid Palangi, Ahmed Hassan Awadallah
Large-scale Pre-Trained Language Models (PTLMs) capture knowledge from massive human-written data which contains latent societal biases and toxic content.
no code implementations • CVPR 2023 • Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh S. Rawat
In this work, we perform a large-scale robustness analysis of these existing models for video action recognition.
1 code implementation • 20 Dec 2022 • Tejas Gokhale, Hamid Palangi, Besmira Nushi, Vibhav Vineet, Eric Horvitz, Ece Kamar, Chitta Baral, Yezhou Yang
We investigate the ability of T2I models to generate correct spatial relationships among objects and present VISOR, an evaluation metric that captures how accurately the spatial relationship described in text is generated in the image.
1 code implementation • NeurIPS 2023 • Thomas Hartvigsen, Swami Sankaranarayanan, Hamid Palangi, Yoon Kim, Marzyeh Ghassemi
We propose GRACE, a lifelong model editing method, which implements spot-fixes on streaming errors of a deployed model, ensuring minimal impact on unrelated inputs.
1 code implementation • 20 Nov 2022 • Abdelrahman Zayed, Prasanna Parthasarathi, Goncalo Mordido, Hamid Palangi, Samira Shabanian, Sarath Chandar
The fairness achieved by our method surpasses that of data augmentation on three text classification datasets, using no more than half of the examples in the augmented dataset.
no code implementations • 8 Nov 2022 • Saadia Gabriel, Hamid Palangi, Yejin Choi
While a substantial body of prior work has explored adversarial example generation for natural language understanding tasks, these examples are often unrealistic and diverge from the real-world data distributions.
no code implementations • MTSummit 2021 • Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky
Machine translation has seen rapid progress with the advent of Transformer-based models.
1 code implementation • 5 Jul 2022 • Madeline C. Schiappa, Shruti Vyas, Hamid Palangi, Yogesh S. Rawat, Vibhav Vineet
Joint visual and language modeling on large-scale datasets has recently shown good progress in multi-modal tasks when compared to single modal learning.
1 code implementation • 4 Jul 2022 • Madeline Chantry Schiappa, Naman Biyani, Prudvi Kamtam, Shruti Vyas, Hamid Palangi, Vibhav Vineet, Yogesh Rawat
In this work, we perform a large-scale robustness analysis of these existing models for video action recognition.
1 code implementation • ACL 2022 • Thomas Hartvigsen, Saadia Gabriel, Hamid Palangi, Maarten Sap, Dipankar Ray, Ece Kamar
To help mitigate these issues, we create ToxiGen, a new large-scale and machine-generated dataset of 274k toxic and benign statements about 13 minority groups.
1 code implementation • NAACL 2021 • Yichen Jiang, Asli Celikyilmaz, Paul Smolensky, Paul Soulos, Sudha Rao, Hamid Palangi, Roland Fernandez, Caitlin Smith, Mohit Bansal, Jianfeng Gao
On several syntactic and semantic probing tasks, we demonstrate the emergent structural information in the role vectors and improved syntactic interpretability in the TPR layer outputs.
1 code implementation • 19 May 2021 • Jacob Russin, Roland Fernandez, Hamid Palangi, Eric Rosen, Nebojsa Jojic, Paul Smolensky, Jianfeng Gao
A longstanding question in cognitive science concerns the learning mechanisms underlying compositionality in human cognition.
1 code implementation • 18 Nov 2020 • Hassan Akbari, Hamid Palangi, Jianwei Yang, Sudha Rao, Asli Celikyilmaz, Roland Fernandez, Paul Smolensky, Jianfeng Gao, Shih-Fu Chang
In this paper, we propose a new model architecture for learning multi-modal neuro-symbolic representations for video captioning.
no code implementations • ICML 2020 • Saeed Amizadeh, Hamid Palangi, Oleksandr Polozov, Yichen Huang, Kazuhito Koishida
To address this, we propose (1) a framework to isolate and evaluate the reasoning aspect of VQA separately from its perception, and (2) a novel top-down calibration technique that allows the model to answer reasoning questions even with imperfect perception.
no code implementations • 22 May 2020 • Yuhang Song, Wenbo Li, Lei Zhang, Jianwei Yang, Emre Kiciman, Hamid Palangi, Jianfeng Gao, C.-C. Jay Kuo, Pengchuan Zhang
We study in this paper the problem of novel human-object interaction (HOI) detection, aiming at improving the generalization ability of the model to unseen scenarios.
1 code implementation • 25 Oct 2019 • Mehrad Moradshahi, Hamid Palangi, Monica S. Lam, Paul Smolensky, Jianfeng Gao
We introduce HUBERT which combines the structured-representational power of Tensor-Product Representations (TPRs) and BERT, a pre-trained bidirectional Transformer language model.
2 code implementations • ICML 2020 • Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao
The encoder of TP-N2F employs TPR 'binding' to encode natural-language symbolic structure in vector space and the decoder uses TPR 'unbinding' to generate, in symbolic space, a sequential program represented by relational tuples, each consisting of a relation (or operation) and a number of arguments.
no code implementations • 25 Sep 2019 • Kezhen Chen, Qiuyuan Huang, Hamid Palangi, Paul Smolensky, Kenneth D. Forbus, Jianfeng Gao
Generating formal language represented by relational tuples, such as Lisp programs or mathematical expressions, from a natural-language input is an extremely challenging task because it requires explicitly capturing discrete symbolic structural information from the input to generate the output.
3 code implementations • 24 Sep 2019 • Luowei Zhou, Hamid Palangi, Lei Zhang, Houdong Hu, Jason J. Corso, Jianfeng Gao
The model is unified in that (1) it can be fine-tuned for either vision-language generation (e.g., image captioning) or understanding (e.g., visual question answering) tasks, and (2) it uses a shared multi-layer transformer network for both encoding and decoding, which differs from many existing methods where the encoder and decoder are implemented using separate models.
Ranked #1 on Image Captioning on Flickr30k Captions test
no code implementations • 22 Sep 2019 • Kuang-Huei Lee, Hamid Palangi, Xi Chen, Houdong Hu, Jianfeng Gao
In this work, we tackle two fundamental language-and-vision tasks: image-text matching and image captioning, and demonstrate that neural scene graph generators can learn effective visual relation features to facilitate grounding language to visual relations and subsequently improve the two end applications.
no code implementations • 23 May 2017 • Hamid Palangi, Paul Smolensky, Xiaodong He, Li Deng
In our application of TPRN, internal representations learned by end-to-end optimization in a deep neural network performing a textual question-answering (QA) task can be interpreted using basic concepts from linguistic theory.
no code implementations • 20 Aug 2015 • Hamid Palangi, Rabab Ward, Li Deng
As the proposed method is data-driven, it is only applicable when training data is available.
no code implementations • 24 Feb 2015 • Hamid Palangi, Li Deng, Yelong Shen, Jianfeng Gao, Xiaodong He, Jianshu Chen, Xinying Song, Rabab Ward
The results show that the proposed method in this paper significantly outperforms it for the web document retrieval task.
no code implementations • 13 Nov 2013 • Hamid Palangi, Li Deng, Rabab K. Ward
In this paper, we devise a special technique that takes advantage of this linearity in the output units of an ESN to learn the input and recurrent matrices.