1 code implementation • 17 Jun 2024 • Belinda Z. Li, Emmy Liu, Alexis Ross, Abbas Zeitoun, Graham Neubig, Jacob Andreas
This paper introduces ERASE, which improves model behavior as new documents are acquired, by incrementally deleting or rewriting other entries in the knowledge base each time a document is added.
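Conceptually, the update resembles the following loop. This is a minimal sketch, assuming hypothetical `related`, `supported`, and `rewrite` helpers (standing in for retrieval and LM-based consistency checks), not the authors' implementation:

```python
def erase_style_update(kb, new_doc, related, supported, rewrite):
    """Incorporate new_doc, deleting or rewriting entries it contradicts."""
    updated = []
    for entry in kb:
        # Keep entries the new document doesn't touch or still supports.
        if not related(entry, new_doc) or supported(entry, new_doc):
            updated.append(entry)
            continue
        revised = rewrite(entry, new_doc)  # hypothetical: None means delete
        if revised is not None:
            updated.append(revised)
    updated.append(new_doc)  # finally, add the new document itself
    return updated
```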
no code implementations • 8 May 2024 • Canaan Breiss, Alexis Ross, Amani Maina-Kilaas, Roger Levy, Jacob Andreas
We propose an interactive approach to language learning that uses linguistic acceptability judgments from an informant (a competent language user) to learn a grammar.
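One way to make such judgments sample-efficient is an active-learning loop: query the string on which the surviving candidate grammars disagree most. The sketch below is illustrative only; the disagreement heuristic and all names are our assumptions, not the paper's exact procedure.

```python
def learn_from_informant(candidate_grammars, strings, informant, budget=20):
    """Each grammar is a predicate string -> bool (acceptable or not)."""
    grammars = list(candidate_grammars)
    for _ in range(budget):
        if len(grammars) <= 1:
            break
        # Query the string that splits the surviving grammars most evenly:
        # its acceptability judgment is maximally informative.
        def split(s):
            yes = sum(g(s) for g in grammars)
            return min(yes, len(grammars) - yes)
        query = max(strings, key=split)
        judgment = informant(query)  # True iff the informant accepts it
        grammars = [g for g in grammars if g(query) == judgment]
    return grammars
```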
no code implementations • 7 May 2024 • Alexis Ross, Jacob Andreas
AdapT has two components: (1) a collection of simulated Bayesian student models that can be used to evaluate automated teaching methods; and (2) a platform for evaluation with human students, to characterize the real-world effectiveness of these methods.
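For component (1), a simulated Bayesian student can be as simple as a posterior over candidate concepts that is re-weighted after each teaching example. The toy update below is our illustration of the idea, not the benchmark's actual student models.

```python
def bayes_update(posterior, hypotheses, example, label, noise=0.1):
    """One noisy-observation Bayesian update over candidate concepts."""
    # hypotheses are predicates example -> bool; noise is the chance
    # that a label disagrees with the true concept.
    scores = [p * ((1 - noise) if h(example) == label else noise)
              for p, h in zip(posterior, hypotheses)]
    z = sum(scores)
    return [s / z for s in scores]
```

A teaching method can then be scored by how quickly the student's posterior concentrates on the target concept.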
1 code implementation • 5 Jul 2023 • Zhaofeng Wu, Linlu Qiu, Alexis Ross, Ekin Akyürek, Boyuan Chen, Bailin Wang, Najoung Kim, Jacob Andreas, Yoon Kim
The impressive performance of recent language models across a wide range of tasks suggests that they possess a degree of abstract reasoning ability.
1 code implementation • 21 Jun 2023 • Mike D'Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, Tom Hope, Doug Downey
We introduce the task of automatically revising scientific papers based on peer feedback and release ARIES, a dataset of review comments and their corresponding paper edits.
no code implementations • 15 Jun 2023 • Ian R. McKenzie, Alexander Lyzhov, Michael Pieler, Alicia Parrish, Aaron Mueller, Ameya Prabhu, Euan McLean, Aaron Kirtland, Alexis Ross, Alisa Liu, Andrew Gritsevskiy, Daniel Wurgaft, Derik Kauffman, Gabriel Recchia, Jiacheng Liu, Joe Cavanagh, Max Weiss, Sicong Huang, The Floating Droid, Tom Tseng, Tomasz Korbak, Xudong Shen, Yuhui Zhang, Zhengping Zhou, Najoung Kim, Samuel R. Bowman, Ethan Perez
Here, we present evidence for the claim that LMs may show inverse scaling, or worse task performance with increased scale, e.g., due to flaws in the training objective and data.
1 code implementation • 26 May 2023 • Marcos Treviso, Alexis Ross, Nuno M. Guerreiro, André F. T. Martins
Selective rationales and counterfactual examples have emerged as two effective, complementary classes of interpretability methods for analyzing and training NLP models.
no code implementations • 24 Oct 2022 • Alexis Ross, Matthew E. Peters, Ana Marasović
Specifically, we evaluate how training self-rationalization models with free-text rationales affects robustness to spurious correlations in fine-tuned encoder-decoder and decoder-only models of six different sizes.
1 code implementation • ACL 2022 • Alexis Ross, Tongshuang Wu, Hao Peng, Matthew E. Peters, Matt Gardner
We craft a set of operations to modify the control codes, which in turn steer generation towards targeted attributes.
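Schematically, a perturbation operation edits one field of a structured control specification before generation. In the sketch below, the field names and the final generator call are placeholders, not the system's actual control-code vocabulary or API.

```python
def change_code(codes, field, value):
    """One perturbation operation: overwrite a single control field."""
    perturbed = dict(codes)
    perturbed[field] = value
    return perturbed

codes = {"VERB_TENSE": "past", "AGENT": "the chef"}
perturbed = change_code(codes, "VERB_TENSE", "future")  # steer toward future tense
prompt = " | ".join(f"{k}:{v}" for k, v in perturbed.items())
# A controlled generator would then condition on `prompt` plus the input text.
```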
no code implementations • EMNLP 2021 • Matt Gardner, William Merrill, Jesse Dodge, Matthew E. Peters, Alexis Ross, Sameer Singh, Noah A. Smith
In this work we argue that for complex language understanding tasks, all simple feature correlations are spurious, and we formalize this notion into a class of problems which we call competency problems.
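Roughly (our gloss, not necessarily the paper's exact statement): in a competency problem, every simple feature is marginally uninformative about the label, so any feature-label correlation observed in a finite dataset is a sampling artifact rather than a property of the task.

```latex
% Gloss of the competency-problem condition (our paraphrase):
\[
  p(y \mid x_i) = p(y) \qquad \text{for every simple feature } x_i .
\]
```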
1 code implementation • Findings (ACL) 2021 • Alexis Ross, Ana Marasović, Matthew E. Peters
Humans have been shown to give contrastive explanations, which explain why an observed event happened rather than some other counterfactual event (the contrast case).
1 code implementation • NeurIPS 2021 • Alexis Ross, Himabindu Lakkaraju, Osbert Bastani
As machine learning models are increasingly deployed in high-stakes domains such as legal and financial decision-making, there has been growing interest in post-hoc methods for generating counterfactual explanations.
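For orientation, the standard gradient-based formulation (in the style of Wachter et al.) searches for a nearby input that flips the model's prediction. The sketch below illustrates that generic object of study, not this paper's algorithm.

```python
import torch

def counterfactual(model, x, target=1.0, lam=0.1, steps=200, lr=0.05):
    """Search for x' near x whose predicted score approaches `target`."""
    x_cf = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = model(x_cf)
        # Trade off moving the prediction toward `target` against an
        # L1 penalty that keeps the counterfactual close to x.
        loss = ((pred - target) ** 2).sum() + lam * (x_cf - x).abs().sum()
        loss.backward()
        opt.step()
    return x_cf.detach()
```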
no code implementations • IJCNLP 2019 • Alexis Ross, Ellie Pavlick
In natural language inference (NLI), contexts are considered veridical if they allow us to infer that their underlying propositions make true claims about the real world: for example, "she knows that it rained" licenses the inference that it rained, while "she hopes that it rained" does not.
no code implementations • SEMEVAL 2019 • Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick
Our results show that pretraining on language modeling performs best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models; CCG supertagging and NLI pretraining perform comparably.