Search Results for author: R. Thomas McCoy

Found 26 papers, 10 papers with code

Deep de Finetti: Recovering Topic Distributions from Large Language Models

no code implementations 21 Dec 2023 Liyi Zhang, R. Thomas McCoy, Theodore R. Sumers, Jian-Qiao Zhu, Thomas L. Griffiths

Large language models (LLMs) can produce long, coherent passages of text, suggesting that LLMs, although trained on next-word prediction, must represent the latent structure that characterizes a document.

Bayesian Inference
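
The snippet above doesn't spell out the mechanism, but the title points to de Finetti's theorem, which licenses reading a model of exchangeable data as carrying an implicit posterior over a latent variable such as a topic distribution. A rough statement of that connection (standard textbook form, not necessarily the paper's notation):

```latex
% De Finetti: an exchangeable sequence behaves as if drawn i.i.d. given a latent \theta
p(w_1, \dots, w_n) \;=\; \int \Big[\prod_{i=1}^{n} p(w_i \mid \theta)\Big]\, p(\theta)\, d\theta
% so any model of the left-hand side implicitly encodes a posterior over \theta:
p(\theta \mid w_{1:n}) \;\propto\; p(\theta) \prod_{i=1}^{n} p(w_i \mid \theta)
```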

Bayes in the age of intelligent machines

no code implementations 16 Nov 2023 Thomas L. Griffiths, Jian-Qiao Zhu, Erin Grant, R. Thomas McCoy

The success of methods based on artificial neural networks in creating intelligent machines seems like it might pose a challenge to explanations of human cognition in terms of Bayesian inference.

Bayesian Inference

Embers of Autoregression: Understanding Large Language Models Through the Problem They Are Trained to Solve

no code implementations 24 Sep 2023 R. Thomas McCoy, Shunyu Yao, Dan Friedman, Matthew Hardy, Thomas L. Griffiths

This approach - which we call the teleological approach - leads us to identify three factors that we hypothesize will influence LLM accuracy: the probability of the task to be performed, the probability of the target output, and the probability of the provided input.
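
One concrete way to read the "task probability" factor: shift ciphers. rot-13 is a widely used convention online, while other shift amounts are rare, even though every shift is the same algorithm, so the teleological view predicts better LLM accuracy on the common variant. The helper below is illustrative, not from the paper's code; it only builds the two decoding prompts and omits any model call.

```python
# Hypothetical illustration of the "task probability" factor: rot-13 is a
# common internet convention, rot-2 is not, yet both are the same algorithm.
import string

def shift_encode(text: str, shift: int) -> str:
    """Apply a Caesar/shift cipher to the lowercase letters of `text`."""
    alphabet = string.ascii_lowercase
    table = str.maketrans(alphabet, alphabet[shift:] + alphabet[:shift])
    return text.lower().translate(table)

plaintext = "stay here"
for shift, name in [(13, "rot-13 (common task)"), (2, "rot-2 (rare task)")]:
    ciphertext = shift_encode(plaintext, shift)
    prompt = f"Decode this shift cipher (shift = {shift}): {ciphertext}"
    print(name, "->", prompt)
```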

Modeling rapid language learning by distilling Bayesian priors into artificial neural networks

1 code implementation 24 May 2023 R. Thomas McCoy, Thomas L. Griffiths

We show that learning from limited naturalistic data is possible with an approach that combines the strong inductive biases of a Bayesian model with the flexible representations of a neural network.
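
The abstract names the ingredients but not the recipe. A heavily simplified sketch of the data side of "prior distillation": draw many small synthetic datasets from a Bayesian generative model, so that a network trained across them absorbs the prior into its weights before it ever sees limited naturalistic data. The toy prior, vocabulary, and episode sizes below are made up, and the actual training loop is omitted.

```python
# A minimal sketch (not the paper's setup): "distilling" a Bayesian prior means
# training a network on many datasets sampled from the prior's generative model,
# so the inductive bias ends up in the network's weights. Here we only build the
# sampler for toy "languages"; the distillation step would train a sequence model
# on a stream of these episodes before fine-tuning it on limited real data.
import numpy as np

rng = np.random.default_rng(0)
VOCAB = list("abcdefgh")

def sample_language(concentration: float = 0.3) -> np.ndarray:
    """Draw one 'language' = a unigram distribution over the toy vocabulary."""
    return rng.dirichlet([concentration] * len(VOCAB))

def sample_dataset(theta: np.ndarray, n_sentences: int = 5, length: int = 6):
    """Sample a small dataset of sentences from one language."""
    return ["".join(rng.choice(VOCAB, size=length, p=theta)) for _ in range(n_sentences)]

# Each episode = a fresh language drawn from the prior plus a few sentences from it.
episodes = [sample_dataset(sample_language()) for _ in range(3)]
for i, episode in enumerate(episodes):
    print(f"episode {i}: {episode}")
```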

How poor is the stimulus? Evaluating hierarchical generalization in neural networks trained on child-directed speech

1 code implementation 26 Jan 2023 Aditya Yedetore, Tal Linzen, Robert Frank, R. Thomas McCoy

When acquiring syntax, children consistently choose hierarchical rules over competing non-hierarchical possibilities.

Neurocompositional computing: From the Central Paradox of Cognition to a new generation of AI systems

no code implementations 2 May 2022 Paul Smolensky, R. Thomas McCoy, Roland Fernandez, Matthew Goldrick, Jianfeng Gao

What explains the dramatic progress from 20th-century to 21st-century AI, and how can the remaining limitations of current AI be overcome?

Universal linguistic inductive biases via meta-learning

1 code implementation 29 Jun 2020 R. Thomas McCoy, Erin Grant, Paul Smolensky, Thomas L. Griffiths, Tal Linzen

To facilitate computational modeling aimed at addressing this question, we introduce a framework for giving particular linguistic inductive biases to a neural network model; such a model can then be used to empirically explore the effects of those inductive biases.

Language Acquisition, Meta-Learning
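
The snippet doesn't name the algorithm; frameworks of this kind typically meta-learn an initialization (MAML-style) across many languages sampled from a prior, so the resulting network starts out with the desired inductive bias. The sketch below uses the simpler first-order Reptile update on toy tasks standing in for languages; the model, task family, and hyperparameters are invented for illustration.

```python
# Minimal sketch of meta-learning an inductive bias into an initialization
# (Reptile-style first-order update; task family, model, and hyperparameters
# are illustrative, not the paper's).
import copy
import torch
import torch.nn as nn

def sample_task():
    """One toy 'language' = a sine regression task with random amplitude/phase."""
    amp = torch.rand(1) * 4 + 1
    phase = torch.rand(1) * 3.14
    def data(n=20):
        x = torch.rand(n, 1) * 10 - 5
        return x, amp * torch.sin(x + phase)
    return data

model = nn.Sequential(nn.Linear(1, 64), nn.Tanh(), nn.Linear(64, 1))
loss_fn = nn.MSELoss()
meta_lr, inner_lr, inner_steps = 0.1, 0.01, 5

for meta_step in range(200):
    data = sample_task()
    inner_model = copy.deepcopy(model)           # start from the current initialization
    opt = torch.optim.SGD(inner_model.parameters(), lr=inner_lr)
    for _ in range(inner_steps):                 # adapt to this one task
        x, y = data()
        opt.zero_grad()
        loss_fn(inner_model(x), y).backward()
        opt.step()
    # Reptile outer update: move the initialization toward the adapted weights.
    with torch.no_grad():
        for p, q in zip(model.parameters(), inner_model.parameters()):
            p += meta_lr * (q - p)
```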

Representations of Syntax [MASK] Useful: Effects of Constituency and Dependency Structure in Recursive LSTMs

1 code implementation ACL 2020 Michael A. Lepori, Tal Linzen, R. Thomas McCoy

Sequence-based neural networks show significant sensitivity to syntactic structure, but they still perform less well on syntactic tasks than tree-based networks.

Data Augmentation

Syntactic Data Augmentation Increases Robustness to Inference Heuristics

1 code implementation ACL 2020 Junghyun Min, R. Thomas McCoy, Dipanjan Das, Emily Pitler, Tal Linzen

Pretrained neural models such as BERT, when fine-tuned to perform natural language inference (NLI), often show high accuracy on standard datasets, but display a surprising lack of sensitivity to word order on controlled challenge sets.

Data Augmentation, Natural Language Inference
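
The abstract states the problem (insensitivity to word order) rather than the remedy; the title's augmentation idea can be illustrated by inverting subject and object and labeling the resulting pair as non-entailment, which pushes a fine-tuned model to attend to order. A toy version with single-word slots follows; the real pipeline operates on parsed corpus sentences.

```python
# Hypothetical illustration of inversion-style augmentation for NLI: swap the
# subject and object of a one-word-per-slot S-V-O sentence and label the
# (original, swapped) pair as non-entailment.
def invert_svo(sentence: str) -> str:
    """Swap subject and object of a one-word-per-slot subject-verb-object sentence."""
    subject, verb, obj = sentence.rstrip(".").split()
    return f"{obj} {verb} {subject}."

premise = "Alice greeted Bob."
augmented = {
    "premise": premise,
    "hypothesis": invert_svo(premise),   # "Bob greeted Alice."
    "label": "non-entailment",
}
print(augmented)
```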

Does syntax need to grow on trees? Sources of hierarchical inductive bias in sequence-to-sequence networks

no code implementations TACL 2020 R. Thomas McCoy, Robert Frank, Tal Linzen

We investigate which architectural factors affect the generalization behavior of neural sequence-to-sequence models trained on two syntactic tasks, English question formation and English tense reinflection.

Inductive Bias
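
The two tasks hinge on whether a model generalizes by a linear rule or a hierarchical one. For question formation, the contrast can be made concrete with a toy example; the sentence and rule implementations below are illustrative, not drawn from the paper's training data.

```python
# Toy contrast between the linear rule ("front the first auxiliary") and the
# hierarchical rule ("front the main-clause auxiliary") for English question
# formation; the sentence and auxiliary list are made up for illustration.
AUX = {"can", "will", "does", "is"}

def linear_question(words):
    """Move the first auxiliary in the string to the front."""
    i = next(k for k, w in enumerate(words) if w in AUX)
    return [words[i]] + words[:i] + words[i + 1:]

def hierarchical_question(words, main_aux_index):
    """Move the main-clause auxiliary (here given by its index) to the front."""
    return [words[main_aux_index]] + words[:main_aux_index] + words[main_aux_index + 1:]

sentence = "my walrus that can sing will giggle".split()
print(" ".join(linear_question(sentence)))          # "can my walrus that sing will giggle" (wrong)
print(" ".join(hierarchical_question(sentence, 5))) # "will my walrus that can sing giggle" (right)
```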

RNNs implicitly implement tensor-product representations

1 code implementation ICLR 2019 R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky

Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).

Representation Learning, Sentence
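
A tensor product representation encodes a symbol structure as a sum of filler-role bindings, which is exactly the kind of linear regularity the abstract alludes to. Below is a generic numpy sketch of binding and unbinding with random vectors; the paper fits such decompositions to learned RNN states rather than constructing them by hand.

```python
# Sketch of a tensor product representation (TPR): a sequence is encoded as the
# sum of outer products of filler (symbol) vectors and role (position) vectors,
# and a filler is recovered by unbinding with the corresponding dual role.
# Dimensions and random vectors are illustrative, not the paper's learned ones.
import numpy as np

rng = np.random.default_rng(0)
fillers = {s: rng.normal(size=8) for s in "abc"}   # symbol embeddings
roles = rng.normal(size=(3, 4))                    # one role vector per position

def encode(sequence: str) -> np.ndarray:
    """Bind each symbol to its positional role and sum the outer products."""
    return sum(np.outer(fillers[s], roles[i]) for i, s in enumerate(sequence))

def decode(tpr: np.ndarray, position: int) -> str:
    """Unbind one position using the pseudo-inverse ('dual') role vectors."""
    dual_roles = np.linalg.pinv(roles).T           # rows are dual role vectors
    filler_vec = tpr @ dual_roles[position]
    return min(fillers, key=lambda s: np.linalg.norm(fillers[s] - filler_vec))

tpr = encode("cab")
print([decode(tpr, i) for i in range(3)])          # ['c', 'a', 'b']
```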

Looking for ELMo's friends: Sentence-Level Pretraining Beyond Language Modeling

no code implementations ICLR 2019 Samuel R. Bowman, Ellie Pavlick, Edouard Grave, Benjamin Van Durme, Alex Wang, Jan Hula, Patrick Xia, Raghavendra Pappagari, R. Thomas McCoy, Roma Patel, Najoung Kim, Ian Tenney, Yinghui Huang, Katherin Yu, Shuning Jin, Berlin Chen

Work on the problem of contextualized word representation—the development of reusable neural network components for sentence understanding—has recently seen a surge of progress centered on the unsupervised pretraining task of language modeling with methods like ELMo (Peters et al., 2018).

Language Modelling, Sentence

Probing What Different NLP Tasks Teach Machines about Function Word Comprehension

no code implementations SEMEVAL 2019 Najoung Kim, Roma Patel, Adam Poliak, Alex Wang, Patrick Xia, R. Thomas McCoy, Ian Tenney, Alexis Ross, Tal Linzen, Benjamin Van Durme, Samuel R. Bowman, Ellie Pavlick

Our results show that pretraining on language modeling performs the best on average across our probing tasks, supporting its widespread use for pretraining state-of-the-art NLP models, and CCG supertagging and NLI pretraining perform comparably.

CCG Supertagging, Language Modelling, +3

Right for the Wrong Reasons: Diagnosing Syntactic Heuristics in Natural Language Inference

5 code implementations ACL 2019 R. Thomas McCoy, Ellie Pavlick, Tal Linzen

We find that models trained on MNLI, including BERT, a state-of-the-art model, perform very poorly on HANS, suggesting that they have indeed adopted these heuristics.

Natural Language Inference, Sentence
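
The heuristics in question include lexical overlap: assume a premise entails any hypothesis composed entirely of its words. Below is a toy version of that shortcut, with a made-up example pair where it gives the wrong answer.

```python
# Toy version of the lexical overlap heuristic that HANS is designed to expose:
# predict "entailment" whenever every hypothesis word also appears in the premise.
# A model relying on this shortcut fails on overlap cases that are NOT entailed.
def lexical_overlap_prediction(premise: str, hypothesis: str) -> str:
    premise_words = set(premise.lower().rstrip(".").split())
    hypothesis_words = set(hypothesis.lower().rstrip(".").split())
    return "entailment" if hypothesis_words <= premise_words else "non-entailment"

examples = [
    ("The doctor paid the actor.", "The doctor paid the actor.", "entailment"),
    ("The doctor paid the actor.", "The actor paid the doctor.", "non-entailment"),
]
for premise, hypothesis, gold in examples:
    print(f"heuristic={lexical_overlap_prediction(premise, hypothesis):<15} gold={gold}")
```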

RNNs Implicitly Implement Tensor Product Representations

no code implementations 20 Dec 2018 R. Thomas McCoy, Tal Linzen, Ewan Dunbar, Paul Smolensky

Recurrent neural networks (RNNs) can learn continuous vector representations of symbolic structures such as sequences and sentences; these representations often exhibit linear regularities (analogies).

Representation Learning, Sentence

Non-entailed subsequences as a challenge for natural language inference

no code implementations 29 Nov 2018 R. Thomas McCoy, Tal Linzen

Neural network models have shown great success at natural language inference (NLI), the task of determining whether a premise entails a hypothesis.

Natural Language Inference, Sentence
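
The challenge in the title is that a hypothesis can appear word-for-word inside a premise without being entailed by it. A toy check and an invented example pair of that kind:

```python
# Toy illustration of the challenge: a hypothesis can be a contiguous
# subsequence of the premise and still not be entailed by it.
# The example pair is made up in the style of such challenge sets.
def is_contiguous_subsequence(premise: str, hypothesis: str) -> bool:
    p = premise.lower().rstrip(".").split()
    h = hypothesis.lower().rstrip(".").split()
    return any(p[i:i + len(h)] == h for i in range(len(p) - len(h) + 1))

premise = "The actor who advised the doctor saw the lawyer."
hypothesis = "The doctor saw the lawyer."
print(is_contiguous_subsequence(premise, hypothesis))   # True, yet not entailed:
# it was the actor, not the doctor, who saw the lawyer.
```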
