1 code implementation • 30 Oct 2024 • Apoorv Khandelwal, Tian Yun, Nihal V. Nayak, Jack Merullo, Stephen H. Bach, Chen Sun, Ellie Pavlick
We introduce a benchmark to measure the time to pre-train models on given GPUs and to identify ideal settings for maximizing training speed.
no code implementations • 13 Jun 2024 • Jack Merullo, Carsten Eickhoff, Ellie Pavlick
Although it is known that transformer language models (LMs) pass features from early layers to later layers, it is not well understood how this information is represented and routed by the model.
no code implementations • 28 May 2024 • Suraj Anand, Michael A. Lepori, Jack Merullo, Ellie Pavlick
We study structural in-context learning, which we define as the ability of a model to execute in-context learning on arbitrary tokens, so called because the model must generalize on the basis of, e.g., sentence structure or task structure rather than semantic content encoded in token embeddings.
1 code implementation • 3 May 2024 • Catherine Chen, Jack Merullo, Carsten Eickhoff
Neural models have demonstrated remarkable performance across diverse ranking tasks.
no code implementations • 13 Feb 2024 • Aaron Traylor, Jack Merullo, Michael J. Frank, Ellie Pavlick
Models based on the Transformer neural network architecture have seen success on a wide variety of tasks that appear to require complex "cognitive branching" -- or the ability to maintain pursuit of one goal while accomplishing others.
no code implementations • 24 Oct 2023 • Qinan Yu, Jack Merullo, Ellie Pavlick
By scaling the value vectors of specific attention heads up or down, we can control the likelihood that the model uses the in-context answer on new data.
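The kind of value-scaling intervention described above can be sketched in a few lines. The snippet below rescales one attention head's value projection in GPT-2 via a forward hook; the layer, head, scale factor, and prompt are illustrative assumptions, not the heads or settings identified in the paper.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tok = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2").eval()

LAYER, HEAD, SCALE = 9, 8, 0.5                       # illustrative values only
d_head = model.config.n_embd // model.config.n_head

def scale_head_value(module, inputs, output):
    # GPT-2's c_attn packs [query | key | value] along the last dimension.
    q, k, v = output.split(model.config.n_embd, dim=-1)
    v = v.clone()
    v[..., HEAD * d_head:(HEAD + 1) * d_head] *= SCALE   # rescale one head's values
    return torch.cat([q, k, v], dim=-1)

handle = model.transformer.h[LAYER].attn.c_attn.register_forward_hook(scale_head_value)
ids = tok("The capital of France is", return_tensors="pt").input_ids
with torch.no_grad():
    logits = model(ids).logits
handle.remove()
print(tok.decode(logits[0, -1].argmax().item()))     # next-token prediction under the intervention
```

Comparing the model's next-token distribution with and without the hook shows how much that single head's output shifts the prediction.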
1 code implementation • 12 Oct 2023 • Jack Merullo, Carsten Eickhoff, Ellie Pavlick
We study the circuit previously discovered for the Indirect Object Identification (IOI) task and show that it is mostly reused to solve a seemingly different task: Colored Objects (Ippolito & Callison-Burch, 2023).
1 code implementation • 25 May 2023 • Jack Merullo, Carsten Eickhoff, Ellie Pavlick
A primary criticism towards language models (LMs) is their inscrutability.
1 code implementation • 20 Dec 2022 • Martha Lewis, Nihal V. Nayak, Peilin Yu, Qinan Yu, Jack Merullo, Stephen H. Bach, Ellie Pavlick
Large-scale neural network models combining text and images have made incredible progress in recent years.
1 code implementation • 13 Oct 2022 • Ankita Gupta, Marzena Karpinska, Wenlong Zhao, Kalpesh Krishna, Jack Merullo, Luke Yeh, Mohit Iyyer, Brendan O'Connor
Large-scale, high-quality corpora are critical for advancing research in coreference resolution.
2 code implementations • 30 Sep 2022 • Jack Merullo, Louis Castricato, Carsten Eickhoff, Ellie Pavlick
Prior work has shown that pretrained LMs can be taught to caption images when a vision model's parameters are optimized to encode images in the language space.
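For readers unfamiliar with this line of work, the sketch below illustrates the general recipe of mapping frozen image-encoder features into a language model's input-embedding space and prepending them as a soft prompt. The model choice (GPT-2), feature dimensionality, and single linear projection are illustrative assumptions, not the exact configuration studied in the paper.

```python
import torch
import torch.nn as nn
from transformers import GPT2LMHeadModel

lm = GPT2LMHeadModel.from_pretrained("gpt2")
for p in lm.parameters():
    p.requires_grad = False                  # language model stays frozen

d_image = 768                                # assumed pooled-feature size of a frozen vision encoder
d_lm, n_prompt = lm.config.n_embd, 4
proj = nn.Linear(d_image, n_prompt * d_lm)   # the only trained parameters in this sketch

def caption_logits(image_feats: torch.Tensor, caption_ids: torch.Tensor) -> torch.Tensor:
    """image_feats: [batch, d_image]; caption_ids: [batch, T] token ids."""
    soft_prompt = proj(image_feats).view(-1, n_prompt, d_lm)   # image "tokens" in LM space
    tok_embeds = lm.transformer.wte(caption_ids)               # [batch, T, d_lm]
    inputs = torch.cat([soft_prompt, tok_embeds], dim=1)       # prepend the image prefix
    return lm(inputs_embeds=inputs).logits

# Training would minimize cross-entropy on the caption tokens given the image prefix,
# updating only `proj` (prior work instead optimizes the vision encoder itself).
```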
1 code implementation • *SEM (NAACL) 2022 • Jack Merullo, Dylan Ebert, Carsten Eickhoff, Ellie Pavlick
Lexical semantics and cognitive science point to affordances (i.e., the actions that objects support) as critical for understanding and representing nouns and verbs.
1 code implementation • IJCNLP 2019 • Jack Merullo, Luke Yeh, Abram Handler, Alvin Grissom II, Brendan O'Connor, Mohit Iyyer
Sports broadcasters inject drama into play-by-play commentary by building team and player narratives through subjective analyses and anecdotes.