no code implementations • 14 Mar 2023 • Kelvin Guu, Albert Webson, Ellie Pavlick, Lucas Dixon, Ian Tenney, Tolga Bolukbasi
To study such interactions, we propose Simfluence, a new paradigm for training data attribution (TDA) in which the goal is not to produce a single influence score per example, but instead a training run simulator: the user asks, ``If my model had trained on example $z_1$, then $z_2$, ..., then $z_n$, how would it behave on $z_{test}$?''
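The linear simulator the paper proposes admits a compact illustration. Below is a minimal sketch in that spirit: each training example consumed at a step multiplies and shifts the simulated test loss by learned per-example factors. The function and parameter names are ours, and the values of `a` and `b` are hypothetical placeholders for factors that would be regressed from losses logged on past training runs.

```python
import numpy as np

def simulate_run(curriculum, init_loss, a, b):
    """Simulate a test example's loss over a training run, in the
    linear-simulator style of Simfluence:
        L_t = a[c_t] * L_{t-1} + b[c_t],
    where c_t is the index of the training example consumed at step t.
    (Illustrative sketch, not the authors' code.)
    """
    losses = [init_loss]
    for c in curriculum:  # c indexes the training example used at this step
        losses.append(a[c] * losses[-1] + b[c])
    return np.array(losses)

# Hypothetical usage: 3 training examples, a 5-step curriculum.
a = np.array([0.98, 1.01, 0.95])    # multiplicative effect of each example
b = np.array([-0.01, 0.02, -0.03])  # additive effect of each example
print(simulate_run([0, 1, 2, 2, 0], init_loss=2.3, a=a, b=b))
```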
1 code implementation • 23 May 2022 • Ekin Akyürek, Tolga Bolukbasi, Frederick Liu, Binbin Xiong, Ian Tenney, Jacob Andreas, Kelvin Guu
In this paper, we propose the problem of fact tracing: identifying which training examples taught an LM to generate a particular factual assertion.
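Among the attribution methods the fact-tracing work evaluates are gradient-similarity scorers alongside retrieval baselines such as BM25. As a rough illustration (not the authors' code), here is a TracIn-style scorer at a single checkpoint, assuming a generic PyTorch model; the helper names are ours, and the paper's full setup aggregates such scores over several checkpoints.

```python
import torch

def flat_grad(model, loss_fn, inputs, labels):
    """Flattened gradient of the loss on one example w.r.t. all parameters."""
    loss = loss_fn(model(inputs), labels)
    grads = torch.autograd.grad(loss, list(model.parameters()))
    return torch.cat([g.reshape(-1) for g in grads])

def tracin_score(model, loss_fn, train_ex, test_ex):
    """TracIn-style influence at one checkpoint: the dot product of the
    training example's and test example's loss gradients. Higher scores
    suggest the training example pushed the model toward the test
    prediction. train_ex and test_ex are (inputs, labels) pairs.
    """
    g_train = flat_grad(model, loss_fn, *train_ex)
    g_test = flat_grad(model, loss_fn, *test_ex)
    return torch.dot(g_train, g_test).item()
```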
no code implementations • ACL 2022 • Bhargavi Paranjape, Matthew Lamm, Ian Tenney
To address these challenges, we develop a Retrieve-Generate-Filter (RGF) technique to create counterfactual evaluation and training data with minimal human supervision.
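A hedged sketch of how such a pipeline could be wired together. Here `retriever`, `generator`, and `qa_model` are hypothetical stand-ins for the retrieval system, question-answer-pair generator, and round-trip QA filter that an RGF-style approach relies on; the overlap threshold is likewise illustrative.

```python
def similar(a, b, threshold=0.5):
    """Loose lexical-overlap check (a stand-in for a learned filter)."""
    ta, tb = set(a.lower().split()), set(b.lower().split())
    return len(ta & tb) / max(len(ta | tb), 1) > threshold

def retrieve_generate_filter(question, answer, retriever, generator, qa_model, k=50):
    """Sketch of a Retrieve-Generate-Filter loop for counterfactual data:
    1. retrieve passages related to the original question,
    2. generate candidate (question, answer) pairs grounded in them,
    3. keep near-duplicates of the question whose answer actually changed
       and that a QA model answers consistently.
    """
    candidates = []
    for passage in retriever(question, k=k):          # 1. retrieve
        candidates.extend(generator(passage))         # 2. generate QA pairs
    counterfactuals = []
    for new_q, new_a in candidates:                   # 3. filter
        if similar(new_q, question) and new_a != answer and qa_model(new_q) == new_a:
            counterfactuals.append((new_q, new_a))
    return counterfactuals
```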
2 code implementations • ICLR 2022 • Thibault Sellam, Steve Yadlowsky, Jason Wei, Naomi Saphra, Alexander D'Amour, Tal Linzen, Jasmijn Bastings, Iulia Turc, Jacob Eisenstein, Dipanjan Das, Ian Tenney, Ellie Pavlick
Experiments with pre-trained models such as BERT are often based on a single checkpoint.
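Releasing many pretraining reproductions makes it possible to report results that account for seed variance rather than a single checkpoint. A minimal sketch of that kind of evaluation: score every checkpoint, then bootstrap over seeds for a confidence interval (a simplification of the multi-bootstrap analysis the paper describes; `evaluate` is a user-supplied scoring function).

```python
import numpy as np

def seed_robust_eval(checkpoints, evaluate, n_boot=1000, seed=0):
    """Evaluate each pretraining-seed checkpoint, then bootstrap over
    seeds to estimate a mean score and a 95% interval that reflects
    pretraining variance, not just one lucky (or unlucky) run."""
    rng = np.random.default_rng(seed)
    scores = np.array([evaluate(ckpt) for ckpt in checkpoints])
    boots = [rng.choice(scores, size=len(scores), replace=True).mean()
             for _ in range(n_boot)]
    lo, hi = np.percentile(boots, [2.5, 97.5])
    return scores.mean(), (lo, hi)
```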
no code implementations • 12 Oct 2020 • Kellie Webster, Xuezhi Wang, Ian Tenney, Alex Beutel, Emily Pitler, Ellie Pavlick, Jilin Chen, Ed Chi, Slav Petrov
Pre-trained models have revolutionized natural language understanding.
no code implementations • EMNLP (BlackboxNLP) 2020 • Xikun Zhang, Deepak Ramachandran, Ian Tenney, Yanai Elazar, Dan Roth
Pretrained Language Models (LMs) have been shown to possess significant linguistic, common sense, and factual knowledge.
1 code implementation • EMNLP 2020 • Ian Tenney, James Wexler, Jasmijn Bastings, Tolga Bolukbasi, Andy Coenen, Sebastian Gehrmann, Ellen Jiang, Mahima Pushkarna, Carey Radebaugh, Emily Reif, Ann Yuan
We present the Language Interpretability Tool (LIT), an open-source platform for visualization and understanding of NLP models.
no code implementations • EMNLP (BlackboxNLP) 2020 • Amil Merchant, Elahe Rahimtoroghi, Ellie Pavlick, Ian Tenney
While there has been much recent work studying how linguistic information is encoded in pre-trained sentence representations, comparatively little is understood about how these models change when adapted to solve downstream tasks.
no code implementations • EMNLP 2020 • Julian Michael, Jan A. Botha, Ian Tenney
The success of pretrained contextual encoders, such as ELMo and BERT, has generated a great deal of interest in what these models learn: do they, without explicit supervision, learn to encode meaningful notions of linguistic structure?
6 code implementations • ACL 2020 • Yada Pruksachatkun, Phil Yeres, Haokun Liu, Jason Phang, Phu Mon Htut, Alex Wang, Ian Tenney, Samuel R. Bowman
We introduce jiant, an open-source toolkit for conducting multitask and transfer learning experiments on English NLU tasks.
1 code implementation • ACL 2019 • Ian Tenney, Dipanjan Das, Ellie Pavlick
Pre-trained text encoders have rapidly advanced the state of the art on many NLP tasks.
2 code implementations • ICLR 2019 • Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick
The jiant toolkit for general-purpose text understanding models
no code implementations • ICLR 2019 • Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen
Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).
no code implementations • SEMEVAL 2019 • Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick
Our results show that language-modeling pretraining performs best on average across our probing tasks, supporting its widespread use in state-of-the-art NLP models, with CCG supertagging and NLI pretraining performing comparably.
no code implementations • ACL 2019 • Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman
Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling.
no code implementations • EMNLP 2018 • Manaal Faruqui, Ellie Pavlick, Ian Tenney, Dipanjan Das
We release a corpus of 43 million atomic edits across 8 languages.