no code implementations • 31 Oct 2023 • Peter West, Ximing Lu, Nouha Dziri, Faeze Brahman, Linjie Li, Jena D. Hwang, Liwei Jiang, Jillian Fisher, Abhilasha Ravichander, Khyathi Chandu, Benjamin Newman, Pang Wei Koh, Allyson Ettinger, Yejin Choi
Specifically, we propose and test the Generative AI Paradox hypothesis: generative models, having been trained directly to reproduce expert-like outputs, acquire generative capabilities that are not contingent upon, and can therefore exceed, their ability to understand those same types of outputs.
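A minimal sketch of how such a hypothesis test could be structured, with hypothetical stub interfaces (`generate`, `discriminate`) standing in for a real model; this is illustrative only, not the paper's protocol:

```python
# Hypothetical stub model: fixed answers for generation and for selection.
model = {"generated": {"capital of France?": "Paris"},
         "selected": {"capital of France?": "Rome"}}

def generate(model, question):                 # generative interface (stub)
    return model["generated"][question]

def discriminate(model, question, options):    # understanding interface (stub)
    return model["selected"][question]

items = [("capital of France?", ["Paris", "Rome"], "Paris")]
gen_acc = sum(generate(model, q) == gold for q, _, gold in items) / len(items)
dis_acc = sum(discriminate(model, q, opts) == gold
              for q, opts, gold in items) / len(items)
# Under the hypothesis, generation can succeed where understanding fails.
print(f"generation: {gen_acc:.0%}, understanding: {dis_acc:.0%}")
```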
no code implementations • 24 May 2023 • Benjamin Newman, Luca Soldaini, Raymond Fok, Arman Cohan, Kyle Lo
Many real-world applications (e.g., note-taking, search) require extracting a sentence or paragraph from a document and showing that snippet to a human outside of the source document.
no code implementations • 26 Feb 2023 • Liye Fu, Benjamin Newman, Maurice Jakesch, Sarah Kreps
Our findings have implications for designing task-appropriate communication assistance systems.
1 code implementation • 16 Nov 2022 • Percy Liang, Rishi Bommasani, Tony Lee, Dimitris Tsipras, Dilara Soylu, Michihiro Yasunaga, Yian Zhang, Deepak Narayanan, Yuhuai Wu, Ananya Kumar, Benjamin Newman, Binhang Yuan, Bobby Yan, Ce Zhang, Christian Cosgrove, Christopher D. Manning, Christopher Ré, Diana Acosta-Navas, Drew A. Hudson, Eric Zelikman, Esin Durmus, Faisal Ladhak, Frieda Rong, Hongyu Ren, Huaxiu Yao, Jue Wang, Keshav Santhanam, Laurel Orr, Lucia Zheng, Mert Yuksekgonul, Mirac Suzgun, Nathan Kim, Neel Guha, Niladri Chatterji, Omar Khattab, Peter Henderson, Qian Huang, Ryan Chi, Sang Michael Xie, Shibani Santurkar, Surya Ganguli, Tatsunori Hashimoto, Thomas Icard, Tianyi Zhang, Vishrav Chaudhary, William Wang, Xuechen Li, Yifan Mai, Yuhui Zhang, Yuta Koreeda
We present Holistic Evaluation of Language Models (HELM) to improve the transparency of language models.
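As an illustration of the "holistic" idea (not HELM's actual API), the sketch below runs every model on every scenario and records multiple metrics, so results form a transparent model-by-scenario-by-metric grid; all names are invented for the example:

```python
# All scenario, model, and metric names below are invented for illustration.
scenarios = {"qa": ["q1", "q2"], "summarization": ["d1"]}
models = {"model_a": lambda text: text.upper(),
          "model_b": lambda text: text[::-1]}
metrics = {"output_length": lambda output: len(output)}

results = {}
for model_name, model in models.items():
    for scenario_name, inputs in scenarios.items():
        for metric_name, metric in metrics.items():
            scores = [metric(model(x)) for x in inputs]
            results[(model_name, scenario_name, metric_name)] = (
                sum(scores) / len(scores)
            )

for key, value in sorted(results.items()):
    print(key, value)
```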
1 code implementation • ICLR 2022 • Benjamin Newman, Prafulla Kumar Choubey, Nazneen Rajani
The proposed P-Adapters take LLM embeddings as input and output continuous prompts that are used to query the LLM.
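A minimal PyTorch sketch of this idea (not the authors' code; the module name, pooling scheme, and sizes are assumptions): a small network maps the LLM's input embeddings to a sequence of continuous prompt vectors that are prepended before querying the frozen LLM.

```python
import torch
import torch.nn as nn

class ContinuousPromptAdapter(nn.Module):
    """Hypothetical adapter: maps query embeddings to a continuous prompt."""

    def __init__(self, embed_dim: int, prompt_len: int):
        super().__init__()
        self.prompt_len = prompt_len
        self.mlp = nn.Sequential(
            nn.Linear(embed_dim, embed_dim),
            nn.ReLU(),
            nn.Linear(embed_dim, prompt_len * embed_dim),
        )

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, embed_dim) embeddings of the query.
        pooled = input_embeds.mean(dim=1)            # (batch, embed_dim)
        prompts = self.mlp(pooled).view(
            -1, self.prompt_len, input_embeds.size(-1)
        )
        # Prepend the learned continuous prompt; the combined sequence is
        # what would be fed to the (frozen) LLM.
        return torch.cat([prompts, input_embeds], dim=1)

adapter = ContinuousPromptAdapter(embed_dim=768, prompt_len=5)
query_embeds = torch.randn(2, 10, 768)               # stand-in LLM embeddings
print(adapter(query_embeds).shape)                   # torch.Size([2, 15, 768])
```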
no code implementations • 29 Sep 2021 • John Hewitt, Xiang Lisa Li, Sang Michael Xie, Benjamin Newman, Percy Liang
When finetuning a pretrained language model for natural language generation tasks, one currently faces a tradeoff.
1 code implementation • NAACL 2021 • Benjamin Newman, Kai-Siang Ang, Julia Gong, John Hewitt
Targeted syntactic evaluation of subject-verb number agreement in English (TSE) evaluates language models' syntactic knowledge using hand-crafted minimal pairs of sentences that differ only in the main verb's conjugation.
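A minimal sketch of this style of evaluation, using GPT-2 via the `transformers` library as a stand-in scorer (the paper's models and exact protocol may differ): the model should assign higher probability to the grammatical member of each minimal pair.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def sentence_logprob(sentence: str) -> float:
    """Total log-probability the model assigns to a sentence."""
    ids = tokenizer(sentence, return_tensors="pt").input_ids
    with torch.no_grad():
        logits = model(ids).logits
    # Log-probability of each actual next token given its prefix.
    log_probs = torch.log_softmax(logits[:, :-1], dim=-1)
    token_scores = log_probs.gather(2, ids[:, 1:].unsqueeze(-1)).squeeze(-1)
    return token_scores.sum().item()

# A hand-crafted minimal pair: identical except for the main verb's number.
grammatical = "The keys to the cabinet are on the table."
ungrammatical = "The keys to the cabinet is on the table."
print(sentence_logprob(grammatical) > sentence_logprob(ungrammatical))  # expect True
```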
no code implementations • 14 Oct 2020 • Benjamin Newman, Kevin Carlberg, Ruta Desai
We introduce a novel framework for computing and displaying AR assistance that consists of (1) associating an optimal action sequence with the policy of an embodied agent and (2) presenting this sequence to the user as suggestions in the AR system's heads-up display.
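A toy sketch of step (1) under strong simplifying assumptions (a deterministic policy and environment; not the paper's system): rolling out the agent's policy yields the action sequence that the HUD could then surface as suggestions.

```python
from typing import Callable, List

def rollout(policy: Callable[[int], str],
            step: Callable[[int, str], int],
            state: int, goal: int, max_steps: int = 20) -> List[str]:
    """Follow the policy until the goal is reached; return the action list."""
    actions: List[str] = []
    for _ in range(max_steps):
        if state == goal:
            break
        action = policy(state)
        actions.append(action)
        state = step(state, action)
    return actions

# Toy 1-D task: states 0..5; the policy always moves toward the goal.
policy = lambda state: "move_right"
step = lambda state, action: state + 1 if action == "move_right" else state
print(rollout(policy, step, state=0, goal=5))
# ['move_right', 'move_right', 'move_right', 'move_right', 'move_right']
```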
1 code implementation • EMNLP (BlackboxNLP) 2020 • Benjamin Newman, John Hewitt, Percy Liang, Christopher D. Manning
Extrapolation to unseen sequence lengths is a challenge for neural generative models of language.
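One way such extrapolation might be probed, sketched here with a hypothetical `score` function standing in for a trained model's loss: bucket held-out sequences into lengths seen during training versus longer ones and compare average scores.

```python
from collections import defaultdict

def extrapolation_report(sequences, score, train_max_len):
    """Average score for lengths seen in training vs. longer ones."""
    buckets = defaultdict(list)
    for seq in sequences:
        key = "seen" if len(seq) <= train_max_len else "extrapolation"
        buckets[key].append(score(seq))
    return {key: sum(vals) / len(vals) for key, vals in buckets.items()}

score = lambda seq: 0.1 * len(seq)     # hypothetical per-sequence model loss
data = [list(range(n)) for n in (5, 8, 20, 40)]
print(extrapolation_report(data, score, train_max_len=10))
```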
1 code implementation • SCiL 2020 • Benjamin Newman, Reuben Cohn-Gordon, Christopher Potts
Natural language generation (NLG) systems are commonly evaluated using n-gram overlap measures (e.g., BLEU, ROUGE).
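For reference, the sketch below computes one such overlap measure, sentence-level BLEU, via NLTK; the reference and candidate are illustrative.

```python
from nltk.translate.bleu_score import sentence_bleu, SmoothingFunction

reference = ["the", "cat", "sat", "on", "the", "mat"]
candidate = ["the", "cat", "is", "on", "the", "mat"]

# Smoothing avoids zero scores when a higher-order n-gram never matches.
smooth = SmoothingFunction().method1
score = sentence_bleu([reference], candidate, smoothing_function=smooth)
print(f"BLEU: {score:.3f}")
```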