no code implementations • Findings (EMNLP) 2021 • Jad Kabbara, Jackie Chi Kit Cheung
Moreover, through an automatic evaluation study, we provide evidence of our system's ability to generate linguistic decisions that lead to improved extractive summaries.
no code implementations • COLING 2022 • Jad Kabbara, Jackie Chi Kit Cheung
Presuppositions are assumptions that are taken for granted by an utterance, and identifying them is key to a pragmatic interpretation of language.
1 code implementation • 28 Dec 2024 • Shrestha Mohanty, Sarah Xuan, Jacob Jobraeel, Anurag Kumar, Deb Roy, Jad Kabbara
To address this, we explore how Large Language Models (LLMs) can enrich these excerpts by providing socially relevant context.
no code implementations • 19 Dec 2024 • Shayne Longpre, Nikhil Singh, Manuel Cherep, Kushagra Tiwary, Joanna Materzynska, William Brannon, Robert Mahari, Manan Dey, Mohammed Hamdy, Nayan Saxena, Ahmad Mustafa Anis, Emad A. Alghamdi, Vu Minh Chien, Naana Obeng-Marnu, Da Yin, Kun Qian, Yizhi Li, Minnie Liang, An Dinh, Shrestha Mohanty, Deividas Mataciunas, Tobin South, JianGuo Zhang, Ariel N. Lee, Campbell S. Lund, Christopher Klamm, Damien Sileo, Diganta Misra, Enrico Shippole, Kevin Klyman, Lester JV Miranda, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Vipul Gupta, Vivek Sharma, Xuhui Zhou, Caiming Xiong, Luis Villa, Stella Biderman, Alex Pentland, Sara Hooker, Jad Kabbara
In this work, we conduct the first-of-its-kind, largest longitudinal audit across modalities (popular text, speech, and video datasets), covering their detailed sourcing trends and use restrictions as well as their geographical and linguistic representation.
1 code implementation • 9 Sep 2024 • Suyash Fulay, William Brannon, Shrestha Mohanty, Cassandra Overney, Elinor Poole-Dayan, Deb Roy, Jad Kabbara
In this work, we focus on analyzing the relationship between two concepts essential in both language model alignment and political science: truthfulness and political bias.
no code implementations • 20 Jul 2024 • Shayne Longpre, Robert Mahari, Ariel Lee, Campbell Lund, Hamidah Oderinwale, William Brannon, Nayan Saxena, Naana Obeng-Marnu, Tobin South, Cole Hunter, Kevin Klyman, Christopher Klamm, Hailey Schoelkopf, Nikhil Singh, Manuel Cherep, Ahmad Anis, An Dinh, Caroline Chitongo, Da Yin, Damien Sileo, Deividas Mataciunas, Diganta Misra, Emad Alghamdi, Enrico Shippole, JianGuo Zhang, Joanna Materzynska, Kun Qian, Kush Tiwary, Lester Miranda, Manan Dey, Minnie Liang, Mohammed Hamdy, Niklas Muennighoff, Seonghyeon Ye, Seungone Kim, Shrestha Mohanty, Vipul Gupta, Vivek Sharma, Vu Minh Chien, Xuhui Zhou, Yizhi Li, Caiming Xiong, Luis Villa, Stella Biderman, HanLin Li, Daphne Ippolito, Sara Hooker, Jad Kabbara, Sandy Pentland
To our knowledge, we conduct the first large-scale, longitudinal audit of the consent protocols for the web domains underlying AI training corpora.
no code implementations • 25 Jun 2024 • Elinor Poole-Dayan, Deb Roy, Jad Kabbara
While state-of-the-art Large Language Models (LLMs) have shown impressive performance on many tasks, there has been extensive research on undesirable model behavior such as hallucinations and bias.
1 code implementation • 25 May 2024 • Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami
Using various datasets and prompting techniques that encourage model introspection, we probe the alignment between models' internal and expressed confidence.
no code implementations • 19 Apr 2024 • Shayne Longpre, Robert Mahari, Naana Obeng-Marnu, William Brannon, Tobin South, Katy Gero, Sandy Pentland, Jad Kabbara
New capabilities in foundation models are owed in large part to massive, widely-sourced, and under-documented training data collections.
1 code implementation • 26 Feb 2024 • Hang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex 'Sandy' Pentland, Yoon Kim, Deb Roy, Jad Kabbara
Finally, we find that learning with stories shows a higher retention rate for non-native speakers in the follow-up assessment.
1 code implementation • 25 Oct 2023 • Shayne Longpre, Robert Mahari, Anthony Chen, Naana Obeng-Marnu, Damien Sileo, William Brannon, Niklas Muennighoff, Nathan Khazam, Jad Kabbara, Kartik Perisetla, Xinyi Wu, Enrico Shippole, Kurt Bollacker, Tongshuang Wu, Luis Villa, Sandy Pentland, Sara Hooker
The race to train language models on vast, diverse, and inconsistently documented datasets has raised pressing concerns about the legal and ethical risks for practitioners.
1 code implementation • 23 May 2023 • Robert Morabito, Jad Kabbara, Ali Emami
Debiasing methods that seek to mitigate the tendency of Language Models (LMs) to occasionally output toxic or inappropriate text have recently gained traction.
1 code implementation • 23 May 2023 • William Brannon, Wonjune Kang, Suyash Fulay, Hang Jiang, Brandon Roy, Deb Roy, Jad Kabbara
Learning on text-attributed graphs (TAGs), in which nodes are associated with one or more texts, has been the subject of much recent work.
1 code implementation • 4 May 2023 • Hang Jiang, Xiajie Zhang, Xubo Cao, Cynthia Breazeal, Deb Roy, Jad Kabbara
Despite the many use cases for large language models (LLMs) in creating personalized chatbots, there has been limited research on evaluating the extent to which the behaviors of personalized LLMs accurately and consistently reflect specific personality traits.
no code implementations • NAACL 2019 • Jad Kabbara
Semantics and pragmatics are two complementary and intertwined aspects of meaning in language.
no code implementations • ACL 2018 • Andre Cianflone, Yulan Feng, Jad Kabbara, Jackie Chi Kit Cheung
We introduce the novel task of predicting adverbial presupposition triggers, which is useful for natural language generation tasks such as summarization and dialogue systems.
no code implementations • 11 Jun 2018 • Andre Cianflone, Yulan Feng, Jad Kabbara, Jackie Chi Kit Cheung
We introduce the task of predicting adverbial presupposition triggers such as "also" and "again".
no code implementations • COLING 2016 • Jad Kabbara, Yulan Feng, Jackie Chi Kit Cheung
We examine the potential of recurrent neural networks for handling pragmatic inferences involving complex contextual cues for the task of article usage prediction.