no code implementations • 27 Jan 2023 • Alex Warstadt, Leshem Choshen, Aaron Mueller, Adina Williams, Ethan Wilcox, Chengxu Zhuang
In partnership with CoNLL and CMCL, we provide a platform for approaches to pretraining with a limited-size corpus sourced from data inspired by the input to children.
no code implementations • 18 Dec 2022 • Koustuv Sinha, Jon Gauthier, Aaron Mueller, Kanishka Misra, Keren Fuentes, Roger Levy, Adina Williams
In this paper, we investigate the stability of language models' performance on targeted syntactic evaluations as we vary properties of the input context: the length of the context, the types of syntactic phenomena it contains, and whether or not there are violations of grammaticality.
1 code implementation • 25 Oct 2022 • Aaron Mueller, Yu Xia, Tal Linzen
However, much of this analysis has focused on monolingual models, and analyses of multilingual models have employed correlational methods that are confounded by the choice of probing tasks.
no code implementations • 26 Aug 2022 • Julian Michael, Ari Holtzman, Alicia Parrish, Aaron Mueller, Alex Wang, Angelica Chen, Divyam Madaan, Nikita Nangia, Richard Yuanzhe Pang, Jason Phang, Samuel R. Bowman
We present the results of the NLP Community Metasurvey.
1 code implementation • ACL 2022 • Aaron Mueller, Jason Krone, Salvatore Romeo, Saab Mansour, Elman Mansimov, Yi Zhang, Dan Roth
Label semantic aware systems have leveraged this information for improved text classification performance during fine-tuning and prediction.
1 code implementation • Findings (ACL) 2022 • Aaron Mueller, Robert Frank, Tal Linzen, Luheng Wang, Sebastian Schuster
We find that pre-trained seq2seq models generalize hierarchically when performing syntactic transformations, whereas models trained from scratch on syntactic transformations do not.
1 code implementation • ACL 2021 • Matthew Finlayson, Aaron Mueller, Sebastian Gehrmann, Stuart Shieber, Tal Linzen, Yonatan Belinkov
Targeted syntactic evaluations have demonstrated the ability of language models to perform subject-verb agreement given difficult contexts.
1 code implementation • NAACL 2021 • Aaron Mueller, Mark Dredze
Neural topic models can augment or replace bag-of-words inputs with the learned representations of deep pre-trained transformer-based word prediction models.
1 code implementation • ACL (GEM) 2021 • Alexandra DeLucia, Aaron Mueller, Xiang Lisa Li, João Sedoc
Narrative generation is an open-ended NLP task in which a model generates a story given a prompt.
no code implementations • 13 Oct 2020 • Aaron Mueller, Zach Wood-Doughty, Silvio Amir, Mark Dredze, Alicia L. Nobles
The #MeToo movement on Twitter has drawn attention to the pervasive nature of sexual harassment and violence.
no code implementations • LREC 2020 • Garrett Nicolai, Dylan Lewis, Arya D. McCarthy, Aaron Mueller, Winston Wu, David Yarowsky
Exploiting the broad translation of the Bible into the world's languages, we train and distribute morphosyntactic tools for approximately one thousand languages, vastly outstripping previous distributions of tools devoted to the processing of inflectional morphology.
no code implementations • LREC 2020 • Aaron Mueller, Garrett Nicolai, Arya D. McCarthy, Dylan Lewis, Winston Wu, David Yarowsky
We find that best practices in this domain are highly language-specific: adding more languages to a training set is often better, but too many harms performance; the best number depends on the source language.
no code implementations • LREC 2020 • Arya D. McCarthy, Rachel Wicks, Dylan Lewis, Aaron Mueller, Winston Wu, Oliver Adams, Garrett Nicolai, Matt Post, David Yarowsky
The corpus consists of over 4000 unique translations of the Christian Bible and counting.
2 code implementations • ACL 2020 • Aaron Mueller, Garrett Nicolai, Panayiota Petrou-Zeniou, Natalia Talmina, Tal Linzen
On other constructions, agreement accuracy was generally higher in languages with richer morphology.
1 code implementation • IJCNLP 2019 • Arya D. McCarthy, Winston Wu, Aaron Mueller, Bill Watson, David Yarowsky
There is an extensive history of scholarship into what constitutes a "basic" color term, as well as a broadly attested acquisition sequence of basic color terms across many languages, as articulated in the seminal work of Berlin and Kay (1969).
no code implementations • IJCNLP 2019 • Marten van Schijndel, Aaron Mueller, Tal Linzen
We investigate to what extent these shortcomings can be mitigated by increasing the size of the network and the corpus on which it is trained.