1 code implementation • 24 May 2023 • R. Thomas McCoy, Thomas L. Griffiths
We show that learning from limited naturalistic data is possible with an approach that combines the strong inductive biases of a Bayesian model with the flexible representations of a neural network.
no code implementations • 26 Jan 2023 • Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy
When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities.
no code implementations • MTSummit 2021 • Paul Soulos, Sudha Rao, Caitlin Smith, Eric Rosen, Asli Celikyilmaz, R. Thomas McCoy, Yichen Jiang, Coleman Haley, Roland Fernandez, Hamid Palangi, Jianfeng Gao, Paul Smolensky
Machine translation has seen rapid progress with the advent of Transformer-based models.
no code implementations • 2 May 2022 • Paul Smolensky, R. Thomas McCoy, Roland Fernandez, Matthew Goldrick, Jianfeng Gao
What explains the dramatic progress from 20th-century to 21st-century AI, and how can the remaining limitations of current AI be overcome?
no code implementations • 18 Nov 2021 • R. Thomas McCoy, Paul Smolensky, Tal Linzen, Jianfeng Gao, Asli Celikyilmaz
We apply these analyses of linguistic novelty to four neural language models trained on English (an LSTM, a Transformer, Transformer-XL, and GPT-2).
1 code implementation • COLING 2020 • Michael A. Lepori, R. Thomas McCoy
As the name implies, contextualized representations of language are typically motivated by their ability to encode context.
1 code implementation • 29 Jun 2020 • R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen
To facilitate computational modeling aimed at addressing this question, we introduce a framework for giving particular linguistic inductive biases to a neural network model; such a model can then be used to empirically explore the effects of those inductive biases.
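As background on how such a bias can be instilled: the framework builds on model-agnostic meta-learning (MAML), in which a network's initial weights are themselves trained so that they adapt quickly to any language drawn from a target distribution. The sketch below illustrates this mechanism under simplifying assumptions (a toy "language" distribution of random linear rules and a small feed-forward network, rather than the paper's sequence models):

import torch

def sample_language(n=64, dim=8):
    # Hypothetical "language": a random linear classification rule, standing in
    # for a syntactic pattern drawn from a space of possible languages.
    w = torch.randn(dim, 1)
    x = torch.randn(n, dim)
    return x, (x @ w > 0).float()

def forward(p, x):
    return torch.relu(x @ p["w1"] + p["b1"]) @ p["w2"] + p["b2"]

loss_fn = torch.nn.BCEWithLogitsLoss()
params = {
    "w1": (0.1 * torch.randn(8, 16)).requires_grad_(),
    "b1": torch.zeros(16, requires_grad=True),
    "w2": (0.1 * torch.randn(16, 1)).requires_grad_(),
    "b2": torch.zeros(1, requires_grad=True),
}
meta_opt = torch.optim.Adam(params.values(), lr=1e-3)
inner_lr = 0.5

for step in range(2000):
    x, y = sample_language()
    # Inner loop: one gradient step of "acquiring" this particular language.
    inner_loss = loss_fn(forward(params, x[:32]), y[:32])
    grads = torch.autograd.grad(inner_loss, list(params.values()), create_graph=True)
    adapted = {k: v - inner_lr * g for (k, v), g in zip(params.items(), grads)}
    # Outer loop: adjust the initialization so that one-step learning generalizes,
    # i.e. distill the language distribution into the network's starting point.
    outer_loss = loss_fn(forward(adapted, x[32:]), y[32:])
    meta_opt.zero_grad()
    outer_loss.backward()
    meta_opt.step()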
1 code implementation • ACL 2020 • Michael A. Lepori, Tal Linzen, R. Thomas McCoy
Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks.
1 code implementation • ACL 2020 • Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen
Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets.
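For illustration, a sketch of how word-order-sensitive NLI probes of this kind can be constructed; the templates and vocabulary below are invented for the example, not taken from the paper's augmentation or challenge sets:

# Illustrative construction of word-order probes for NLI (hypothetical templates).
import itertools

AGENTS = ["the doctor", "the lawyer", "the artist"]
VERBS = ["advised", "thanked", "called"]

def subject_object_swap(subj, verb, obj):
    premise = f"{subj} {verb} {obj}."
    hypothesis = f"{obj} {verb} {subj}."   # same words, reversed roles
    return {"premise": premise, "hypothesis": hypothesis, "label": "non-entailment"}

probes = [subject_object_swap(s, v, o)
          for s, o in itertools.permutations(AGENTS, 2)
          for v in VERBS]
print(probes[0])
# A model that ignores word order will wrongly treat such pairs as entailment.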
no code implementations • TACL 2020 • R. Thomas McCoy, Robert Frank, Tal Linzen
We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection.
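The divergence between a linear and a hierarchical rule is easiest to see on question formation. The toy implementation below contrasts the two candidate rules; the sentences and the hand-specified main-auxiliary index are illustrative stand-ins for the paper's grammar-generated data:

# Two candidate rules for question formation. Sentences are token lists;
# the "main" auxiliary index is supplied by hand here, whereas a hierarchical
# learner must infer it from sentence structure.
AUX = {"can", "does", "do", "is"}

def move_first(tokens):
    # Linear rule: front the first auxiliary in the string.
    i = next(i for i, t in enumerate(tokens) if t in AUX)
    return [tokens[i]] + tokens[:i] + tokens[i + 1:]

def move_main(tokens, main_aux_index):
    # Hierarchical rule: front the auxiliary of the main clause.
    return ([tokens[main_aux_index]] + tokens[:main_aux_index]
            + tokens[main_aux_index + 1:])

simple = "my zebra does chuckle".split()
complex_ = "my zebra that does smile can giggle".split()

print(" ".join(move_first(simple)))      # does my zebra chuckle (both rules agree)
print(" ".join(move_first(complex_)))    # does my zebra that smile can giggle (wrong)
print(" ".join(move_main(complex_, 5)))  # can my zebra that does smile giggle (right)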
1 code implementation • EMNLP (BlackboxNLP) 2020 • R. Thomas McCoy, Junghyun Min, Tal Linzen
If the same neural network architecture is trained multiple times on the same dataset, will it make similar linguistic generalizations across runs?
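The underlying experimental logic can be sketched as follows, with a tiny classifier and synthetic data standing in (purely for illustration) for fine-tuning a large pretrained model:

# Fix the architecture and training data, vary only the random seed, and compare
# generalization on out-of-distribution cases.
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
signal = rng.normal(size=(500, 1))
shortcut = signal + 0.1 * rng.normal(size=(500, 1))        # redundant cue in training
X_train, y_train = np.hstack([signal, shortcut]), (signal[:, 0] > 0).astype(int)

ood_signal = rng.normal(size=(200, 1))
X_ood = np.hstack([ood_signal, rng.normal(size=(200, 1))])  # shortcut cue scrambled
y_ood = (ood_signal[:, 0] > 0).astype(int)

scores = []
for seed in range(10):
    clf = MLPClassifier(hidden_layer_sizes=(16,), max_iter=1000, random_state=seed)
    clf.fit(X_train, y_train)
    scores.append(clf.score(X_ood, y_ood))
# How much these numbers differ across seeds is precisely the question the paper
# asks at scale, with 100 fine-tuning runs of the same pretrained model.
print(np.round(scores, 3))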
2 code implementations • ICLR 2019 • Ian Tenney, Patrick Xia, Berlin Chen, Alex Wang, Adam Poliak, R. Thomas McCoy, Najoung Kim, Benjamin Van Durme, Samuel R. Bowman, Dipanjan Das, Ellie Pavlick
The jiant toolkit for general-purpose text understanding models
no code implementations • ICLR 2019 • Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen
Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).
1 code implementation • ICLR 2019 • R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky
Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).
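A worked example of the kind of linear regularity at issue, using the tensor product representations (sums of filler-vector and role-vector outer products) that the paper argues such networks approximate; the dimensions and random embeddings below are arbitrary choices for illustration:

# Tensor product representation: encode a sequence as the sum of
# filler (symbol) x role (position) outer products. Under this scheme, sequences
# that differ in the same way differ by the same vector, which is the kind of
# linear regularity ("analogy") probed for in RNN representations.
import numpy as np

rng = np.random.default_rng(0)
fillers = {d: rng.normal(size=6) for d in range(10)}   # one vector per digit
roles = {i: rng.normal(size=4) for i in range(5)}      # one vector per position

def tpr(seq):
    return sum(np.outer(fillers[d], roles[i]) for i, d in enumerate(seq)).ravel()

a, b = tpr([5, 2, 3]), tpr([5, 2, 9])   # differ only in the final digit
c, d = tpr([1, 4, 3]), tpr([1, 4, 9])   # same difference, different context
print(np.allclose(a - b, c - d))        # True: the analogy holds exactly for TPRs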
no code implementations • SEMEVAL 2019 • Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick
Our results show that pretraining on language modeling performs best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models; CCG supertagging and NLI pretraining perform comparably.
5 code implementations • ACL 2019 • R. Thomas McCoy, Ellie Pavlick, Tal Linzen
We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted the fallible syntactic heuristics (such as assuming that a premise entails any hypothesis built only from its words) that HANS is designed to diagnose.
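To make the heuristics concrete, here is a toy implementation of the lexical-overlap heuristic that HANS targets; the sentence pairs are invented for illustration rather than drawn from the dataset:

# Predict "entailment" whenever every hypothesis word also appears in the premise.
def overlap_heuristic(premise, hypothesis):
    p, h = premise.lower().split(), hypothesis.lower().split()
    return "entailment" if all(w in p for w in h) else "non-entailment"

cases = [
    ("The doctor near the lawyer spoke", "The doctor spoke", "entailment"),
    ("The doctor advised the lawyer", "The lawyer advised the doctor", "non-entailment"),
]
for premise, hypothesis, gold in cases:
    pred = overlap_heuristic(premise, hypothesis)
    print(f"{pred:15s} (gold: {gold})  {premise!r} -> {hypothesis!r}")
# The heuristic gets the first case right and the second wrong; HANS is built from
# such cases, so a model relying on the heuristic scores near zero on the
# non-entailment half of the dataset.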
no code implementations • ACL 2019 • Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen, Benjamin Van Durme, Edouard Grave, Ellie Pavlick, Samuel R. Bowman
Natural language understanding has recently seen a surge of progress with the use of sentence encoders like ELMo (Peters et al., 2018a) and BERT (Devlin et al., 2019) which are pretrained on variants of language modeling.
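The general recipe these papers evaluate is to reuse an encoder pretrained on (masked) language modeling as a sentence representation and train a lightweight task model on top. A minimal sketch with the Hugging Face transformers library follows; the papers themselves used the jiant toolkit and a broader set of encoders and tasks:

import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
encoder = AutoModel.from_pretrained("bert-base-uncased")

sentences = ["The cat sat on the mat.", "Colorless green ideas sleep furiously."]
batch = tokenizer(sentences, padding=True, return_tensors="pt")
with torch.no_grad():
    outputs = encoder(**batch)
# Mean-pool the token vectors (masking out padding) to get one vector per sentence;
# a task-specific classifier would then be trained on these representations.
mask = batch["attention_mask"].unsqueeze(-1)
sentence_vecs = (outputs.last_hidden_state * mask).sum(1) / mask.sum(1)
print(sentence_vecs.shape)  # torch.Size([2, 768])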
no code implementations • 20 Dec 2018 • R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky
Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).
no code implementations • 29 Nov 2018 • R. Thomas McCoy, Tal Linzen
Neural network models have shown great success at natural language inference (NLI), the task of determining whether a premise entails a hypothesis.
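For readers unfamiliar with the task, NLI data consists of sentence pairs labeled for whether the premise entails, contradicts, or is neutral with respect to the hypothesis; the examples below are invented for illustration:

nli_examples = [
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "A man is performing music.",
     "label": "entailment"},
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "The man is asleep at home.",
     "label": "contradiction"},
    {"premise": "A man is playing a guitar on stage.",
     "hypothesis": "The man is a professional musician.",
     "label": "neutral"},
]
for ex in nli_examples:
    print(f'{ex["label"]:13s} {ex["premise"]}  =>  {ex["hypothesis"]}')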
no code implementations • 25 Feb 2018 • R. Thomas McCoy, Robert Frank, Tal Linzen
We examine the proposal that learners' hypothesis spaces are innately restricted to hierarchical rules by using recurrent neural networks (RNNs), which are not constrained in this way.