Search Results for author: Najoung Kim

Found 23 papers, 10 papers with code

Testing for Grammatical Category Abstraction in Neural Language Models

no code implementations • SCiL 2021 • Najoung Kim, Paul Smolensky

Paper
Add Code

Syn-QA2: Evaluating False Assumptions in Long-tail Questions with Synthetic QA Datasets

1 code implementation • 18 Mar 2024 • Ashwin Daswani, Rohan Sawant, Najoung Kim

Sensitivity to false assumptions (or false premises) in information-seeking questions is critical for robust question-answering (QA) systems.

Question Answering

Paper
Code

Abstraction via exemplars? A representational case study on lexical category inference in BERT

no code implementations • 3 Nov 2023 • Kanishka Misra, Najoung Kim

Exemplar based accounts are often considered to be in direct opposition to pure linguistic abstraction in explaining language learners' ability to generalize to novel expressions.

Paper
Add Code

Personas as a Way to Model Truthfulness in Language Models

no code implementations • 27 Oct 2023 • Nitish Joshi, Javier Rando, Abulhair Saparov, Najoung Kim, He He

This allows the model to separate truth from falsehoods and controls the truthfulness of its generation.

Paper
Add Code

SLOG: A Structural Generalization Benchmark for Semantic Parsing

1 code implementation • 23 Oct 2023 • Bingzhi Li, Lucia Donatelli, Alexander Koller, Tal Linzen, Yuekun Yao, Najoung Kim

The goal of compositional generalization benchmarks is to evaluate how well models generalize to new complex linguistic expressions.

Semantic Parsing

Paper
Code

Reasoning or Reciting? Exploring the Capabilities and Limitations of Language Models Through Counterfactual Tasks

1 code implementation • 5 Jul 2023 • Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim

The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning skills.

counterfactual Language Modelling

Paper
Code

Inverse Scaling: When Bigger Isn't Better

no code implementations • 15 Jun 2023 • Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, Ethan Perez

Here, we present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale, e. g., due to flaws in the training objective and data.

Paper
Add Code

Testing the General Deductive Reasoning Capacity of Large Language Models Using OOD Examples

1 code implementation • NeurIPS 2023 • Abulhair Saparov, Richard Yuanzhe Pang, Vishakh Padmakumar, Nitish Joshi, Seyed Mehran Kazemi, Najoung Kim, He He

Given the intractably large size of the space of proofs, any model that is capable of general deductive reasoning must generalize to proofs of greater complexity.

Paper
Code

Entity Tracking in Language Models

1 code implementation • 3 May 2023 • Najoung Kim, Sebastian Schuster

We use this task to first investigate whether Flan-T5, GPT-3 and GPT-3. 5 can track the state of entities, and find that only GPT-3. 5 models, which have been pretrained on large amounts of code, exhibit this ability.

Language Modelling

Paper
Code

Uncontrolled Lexical Exposure Leads to Overestimation of Compositional Generalization in Pretrained Models

1 code implementation • 21 Dec 2022 • Najoung Kim, Tal Linzen, Paul Smolensky

Human linguistic capacity is often characterized by compositionality and the generalization it enables -- human learners can produce and comprehend novel complex expressions by composing known parts.

Paper
Code

Reconstruction Probing

no code implementations • 21 Dec 2022 • Najoung Kim, Jatin Khilnani, Alex Warstadt, Abed Qaddoumi

We propose reconstruction probing, a new analysis method for contextualized representations based on reconstruction probabilities in masked language models (MLMs).

Paper
Add Code

(QA)$^2$: Question Answering with Questionable Assumptions

1 code implementation • 20 Dec 2022 • Najoung Kim, Phu Mon Htut, Samuel R. Bowman, Jackson Petty

Naturally occurring information-seeking questions often contain questionable assumptions -- assumptions that are false or unverifiable.

Question Answering

Paper
Code

LAMBADA: Backward Chaining for Automated Reasoning in Natural Language

no code implementations • 20 Dec 2022 • Mehran Kazemi, Najoung Kim, Deepti Bhatia, Xin Xu, Deepak Ramachandran

Remarkable progress has been made on automated reasoning with natural text, by using Language Models (LMs) and methods such as Chain-of-Thought and Selection-Inference.

LAMBADA Logical Reasoning

Paper
Add Code

BLOOM: A 176B-Parameter Open-Access Multilingual Language Model

6 code implementations • 9 Nov 2022 • BigScience Workshop, :, Teven Le Scao, Angela Fan, Christopher Akiki, Ellie Pavlick, Suzana Ilić, Daniel Hesslow, Roman Castagné, Alexandra Sasha Luccioni, François Yvon, Matthias Gallé, Jonathan Tow, Alexander M. Rush, Stella Biderman, Albert Webson, Pawan Sasanka Ammanamanchi, Thomas Wang, Benoît Sagot, Niklas Muennighoff, Albert Villanova del Moral, Olatunji Ruwase, Rachel Bawden, Stas Bekman, Angelina McMillan-Major, Iz Beltagy, Huu Nguyen, Lucile Saulnier, Samson Tan, Pedro Ortiz Suarez, Victor Sanh, Hugo Laurençon, Yacine Jernite, Julien Launay, Margaret Mitchell, Colin Raffel, Aaron Gokaslan, Adi Simhi, Aitor Soroa, Alham Fikri Aji, Amit Alfassy, Anna Rogers, Ariel Kreisberg Nitzav, Canwen Xu, Chenghao Mou, Chris Emezue, Christopher Klamm, Colin Leong, Daniel van Strien, David Ifeoluwa Adelani, Dragomir Radev, Eduardo González Ponferrada, Efrat Levkovizh, Ethan Kim, Eyal Bar Natan, Francesco De Toni, Gérard Dupont, Germán Kruszewski, Giada Pistilli, Hady Elsahar, Hamza Benyamina, Hieu Tran, Ian Yu, Idris Abdulmumin, Isaac Johnson, Itziar Gonzalez-Dios, Javier de la Rosa, Jenny Chim, Jesse Dodge, Jian Zhu, Jonathan Chang, Jörg Frohberg, Joseph Tobing, Joydeep Bhattacharjee, Khalid Almubarak, Kimbo Chen, Kyle Lo, Leandro von Werra, Leon Weber, Long Phan, Loubna Ben allal, Ludovic Tanguy, Manan Dey, Manuel Romero Muñoz, Maraim Masoud, María Grandury, Mario Šaško, Max Huang, Maximin Coavoux, Mayank Singh, Mike Tian-Jian Jiang, Minh Chien Vu, Mohammad A. Jauhar, Mustafa Ghaleb, Nishant Subramani, Nora Kassner, Nurulaqilla Khamis, Olivier Nguyen, Omar Espejel, Ona de Gibert, Paulo Villegas, Peter Henderson, Pierre Colombo, Priscilla Amuok, Quentin Lhoest, Rheza Harliman, Rishi Bommasani, Roberto Luis López, Rui Ribeiro, Salomey Osei, Sampo Pyysalo, Sebastian Nagel, Shamik Bose, Shamsuddeen Hassan Muhammad, Shanya Sharma, Shayne Longpre, Somaieh Nikpoor, Stanislav Silberberg, Suhas Pai, Sydney Zink, Tiago Timponi Torrent, Timo Schick, Tristan Thrush, Valentin Danchev, Vassilina Nikoulina, Veronika Laippala, Violette Lepercq, Vrinda Prabhu, Zaid Alyafeai, Zeerak Talat, Arun Raja, Benjamin Heinzerling, Chenglei Si, Davut Emre Taşar, Elizabeth Salesky, Sabrina J. Mielke, Wilson Y. Lee, Abheesht Sharma, Andrea Santilli, Antoine Chaffin, Arnaud Stiegler, Debajyoti Datta, Eliza Szczechla, Gunjan Chhablani, Han Wang, Harshit Pandey, Hendrik Strobelt, Jason Alan Fries, Jos Rozen, Leo Gao, Lintang Sutawika, M Saiful Bari, Maged S. Al-shaibani, Matteo Manica, Nihal Nayak, Ryan Teehan, Samuel Albanie, Sheng Shen, Srulik Ben-David, Stephen H. Bach, Taewoon Kim, Tali Bers, Thibault Fevry, Trishala Neeraj, Urmish Thakker, Vikas Raunak, Xiangru Tang, Zheng-Xin Yong, Zhiqing Sun, Shaked Brody, Yallow Uri, Hadar Tojarieh, Adam Roberts, Hyung Won Chung, Jaesung Tae, Jason Phang, Ofir Press, Conglong Li, Deepak Narayanan, Hatim Bourfoune, Jared Casper, Jeff Rasley, Max Ryabinin, Mayank Mishra, Minjia Zhang, Mohammad Shoeybi, Myriam Peyrounette, Nicolas Patry, Nouamane Tazi, Omar Sanseviero, Patrick von Platen, Pierre Cornette, Pierre François Lavallée, Rémi Lacroix, Samyam Rajbhandari, Sanchit Gandhi, Shaden Smith, Stéphane Requena, Suraj Patil, Tim Dettmers, Ahmed Baruwa, Amanpreet Singh, Anastasia Cheveleva, Anne-Laure Ligozat, Arjun Subramonian, Aurélie Névéol, Charles Lovering, Dan Garrette, Deepak Tunuguntla, Ehud Reiter, Ekaterina Taktasheva, Ekaterina Voloshina, Eli Bogdanov, Genta Indra Winata, Hailey Schoelkopf, Jan-Christoph Kalo, Jekaterina Novikova, Jessica Zosa Forde, Jordan Clive, Jungo Kasai, Ken Kawamura, Liam Hazan, Marine Carpuat, Miruna Clinciu, Najoung Kim, Newton Cheng, Oleg Serikov, Omer Antverg, Oskar van der Wal, Rui Zhang, Ruochen Zhang, Sebastian Gehrmann, Shachar Mirkin, Shani Pais, Tatiana Shavrina, Thomas Scialom, Tian Yun, Tomasz Limisiewicz, Verena Rieser, Vitaly Protasov, Vladislav Mikhailov, Yada Pruksachatkun, Yonatan Belinkov, Zachary Bamberger, Zdeněk Kasner, Alice Rueda, Amanda Pestana, Amir Feizpour, Ammar Khan, Amy Faranak, Ana Santos, Anthony Hevia, Antigona Unldreaj, Arash Aghagol, Arezoo Abdollahi, Aycha Tammour, Azadeh HajiHosseini, Bahareh Behroozi, Benjamin Ajibade, Bharat Saxena, Carlos Muñoz Ferrandis, Daniel McDuff, Danish Contractor, David Lansky, Davis David, Douwe Kiela, Duong A. Nguyen, Edward Tan, Emi Baylor, Ezinwanne Ozoani, Fatima Mirza, Frankline Ononiwu, Habib Rezanejad, Hessie Jones, Indrani Bhattacharya, Irene Solaiman, Irina Sedenko, Isar Nejadgholi, Jesse Passmore, Josh Seltzer, Julio Bonis Sanz, Livia Dutra, Mairon Samagaio, Maraim Elbadri, Margot Mieskes, Marissa Gerchick, Martha Akinlolu, Michael McKenna, Mike Qiu, Muhammed Ghauri, Mykola Burynok, Nafis Abrar, Nazneen Rajani, Nour Elkott, Nour Fahmy, Olanrewaju Samuel, Ran An, Rasmus Kromann, Ryan Hao, Samira Alizadeh, Sarmad Shubber, Silas Wang, Sourav Roy, Sylvain Viguier, Thanh Le, Tobi Oyebade, Trieu Le, Yoyo Yang, Zach Nguyen, Abhinav Ramesh Kashyap, Alfredo Palasciano, Alison Callahan, Anima Shukla, Antonio Miranda-Escalada, Ayush Singh, Benjamin Beilharz, Bo wang, Caio Brito, Chenxi Zhou, Chirag Jain, Chuxin Xu, Clémentine Fourrier, Daniel León Periñán, Daniel Molano, Dian Yu, Enrique Manjavacas, Fabio Barth, Florian Fuhrimann, Gabriel Altay, Giyaseddin Bayrak, Gully Burns, Helena U. Vrabec, Imane Bello, Ishani Dash, Jihyun Kang, John Giorgi, Jonas Golde, Jose David Posada, Karthik Rangasai Sivaraman, Lokesh Bulchandani, Lu Liu, Luisa Shinzato, Madeleine Hahn de Bykhovetz, Maiko Takeuchi, Marc Pàmies, Maria A Castillo, Marianna Nezhurina, Mario Sänger, Matthias Samwald, Michael Cullan, Michael Weinberg, Michiel De Wolf, Mina Mihaljcic, Minna Liu, Moritz Freidank, Myungsun Kang, Natasha Seelam, Nathan Dahlberg, Nicholas Michio Broad, Nikolaus Muellner, Pascale Fung, Patrick Haller, Ramya Chandrasekhar, Renata Eisenberg, Robert Martin, Rodrigo Canalli, Rosaline Su, Ruisi Su, Samuel Cahyawijaya, Samuele Garda, Shlok S Deshmukh, Shubhanshu Mishra, Sid Kiblawi, Simon Ott, Sinee Sang-aroonsiri, Srishti Kumar, Stefan Schweter, Sushil Bharati, Tanmay Laud, Théo Gigant, Tomoya Kainuma, Wojciech Kusa, Yanis Labrak, Yash Shailesh Bajaj, Yash Venkatraman, Yifan Xu, Yingxin Xu, Yu Xu, Zhe Tan, Zhongli Xie, Zifan Ye, Mathilde Bras, Younes Belkada, Thomas Wolf

Large language models (LLMs) have been shown to be able to perform new tasks based on a few demonstrations or natural language instructions.

Language Modelling Multilingual NLP

2,194

Paper
Code

Inverse scaling can become U-shaped

no code implementations • 3 Nov 2022 • Jason Wei, Najoung Kim, Yi Tay, Quoc V. Le

The Inverse Scaling Prize (McKenzie et al. 2022) identified eleven such inverse scaling tasks, evaluated on models of up to 280B parameters and up to 500 zettaFLOPs of training compute.

Attribute

Paper
Add Code

Which Linguist Invented the Lightbulb? Presupposition Verification for Question-Answering

no code implementations • ACL 2021 • Najoung Kim, Ellie Pavlick, Burcu Karagol Ayan, Deepak Ramachandran

Through a user preference study, we demonstrate that the oracle behavior of our proposed system that provides responses based on presupposition failure is preferred over the oracle behavior of existing QA systems.

Explanation Generation Natural Questions +1

Paper
Add Code

COGS: A Compositional Generalization Challenge Based on Semantic Interpretation

1 code implementation • EMNLP 2020 • Najoung Kim, Tal Linzen

Natural language is characterized by compositionality: the meaning of a complex expression is constructed from the meanings of its constituent parts.

Position Semantic Parsing

Paper
Code

Implicit Discourse Relation Classification: We Need to Talk about Evaluation

no code implementations • ACL 2020 • Najoung Kim, Song Feng, Chulaka Gunasekara, Luis Lastras

Implicit relation classification on Penn Discourse TreeBank (PDTB) 2. 0 is a common benchmark task for evaluating the understanding of discourse relations.

Classification General Classification +3

Paper
Add Code

What do you learn from context? Probing for sentence structure in contextualized word representations

2 code implementations • ICLR 2019 • Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick

The jiant toolkit for general-purpose text understanding models

Language Modelling Sentence

Paper
Code

Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling

no code implementations • ICLR 2019 • Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen

Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).

Language Modelling Sentence

Paper
Add Code

Probing What Different NLP Tasks Teach Machines about Function Word Comprehension

no code implementations • SEMEVAL 2019 • Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick

Our results show that pretraining on language modeling performs the best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models, and CCG supertagging and NLI pretraining perform comparably.

CCG Supertagging Language Modelling +3

Paper
Add Code

Can You Tell Me How to Get Past Sesame Street? Sentence-Level Pretraining Beyond Language Modeling

no code implementations • ACL 2019 • Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman

Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling.

Language Modelling Natural Language Understanding +2

Paper
Add Code

Predicting the Argumenthood of English Prepositional Phrases

no code implementations • 20 Sep 2018 • Najoung Kim, Kyle Rawlins, Benjamin Van Durme, Paul Smolensky

Distinguishing between arguments and adjuncts of a verb is a longstanding, nontrivial problem.

Semantic Role Labeling Sentence +1

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.