no code implementations • 20 Mar 2024 • Catherine Arnett, Pamela D. Rivière, Tyler A. Chang, Sean Trott
The relationship between language model tokenization and performance is an open area of research.
no code implementations • 13 Mar 2024 • Tyler A. Chang, Katrin Tomanek, Jessica Hoffmann, Nithum Thain, Erin Van Liemt, Kathleen Meier-Hellstern, Lucas Dixon
We explore a strategy to handle controversial topics in LLM-based chatbots based on Wikipedia's Neutral Point of View (NPOV) principle: acknowledge the absence of a single true answer and surface multiple perspectives.
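As a rough illustration of that strategy, here is a minimal Python sketch that assembles an NPOV-style reply from a list of perspectives; the Perspective class, the npov_answer helper, and the response wording are hypothetical and are not taken from the paper's system.

```python
# Illustrative sketch only: the paper describes the NPOV strategy at a higher level;
# the data structure and template below are hypothetical, not the authors' implementation.
from dataclasses import dataclass
from typing import List

@dataclass
class Perspective:
    label: str     # e.g. "Supporters argue", "Critics argue"
    summary: str   # short statement of the viewpoint

def npov_answer(topic: str, perspectives: List[Perspective]) -> str:
    """Compose a reply that acknowledges the absence of a single true answer
    and surfaces multiple perspectives on a controversial topic."""
    lines = [f"There is no single agreed-upon answer on {topic}. "
             "Different groups hold different views:"]
    for p in perspectives:
        lines.append(f"- {p.label}: {p.summary}")
    return "\n".join(lines)

print(npov_answer(
    "this policy question",
    [Perspective("Supporters argue", "it improves outcome A."),
     Perspective("Critics argue", "it risks harm B.")],
))
```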
no code implementations • 1 Mar 2024 • Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen
We release a tool to obtain byte premiums for any two languages, enabling comparisons of dataset sizes across languages for more equitable multilingual model development and data practices.
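As a rough illustration (not the released tool itself), the sketch below approximates a byte premium as the ratio of UTF-8 byte counts for content-matched parallel text; the example sentences are placeholders, and a real estimate would use a full parallel corpus.

```python
# Minimal sketch, assuming the byte premium of language A relative to language B
# is approximated by the ratio of UTF-8 byte counts of parallel (content-matched)
# text. The released tool may estimate this differently.
def utf8_bytes(lines):
    return sum(len(line.encode("utf-8")) for line in lines)

def byte_premium(parallel_a, parallel_b):
    """Bytes needed in language A per byte of the same content in language B."""
    return utf8_bytes(parallel_a) / utf8_bytes(parallel_b)

# Tiny parallel sample (Russian / English); real estimates would use a full corpus.
russian = ["Кошка сидит на ковре.", "Сегодня идёт дождь."]
english = ["The cat sits on the mat.", "It is raining today."]
print(byte_premium(russian, english))  # > 1: Cyrillic letters take 2 bytes each in UTF-8
```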
1 code implementation • 15 Nov 2023 • Tyler A. Chang, Catherine Arnett, Zhuowen Tu, Benjamin K. Bergen
Concrete evidence for the effects of multilinguality on language modeling performance in individual languages remains scarce.
no code implementations • 15 Nov 2023 • James A. Michaelov, Catherine Arnett, Tyler A. Chang, Benjamin K. Bergen
We measure crosslingual structural priming in large language models, comparing model behavior to human experimental results from eight crosslingual experiments covering six languages, and four monolingual structural priming experiments in three non-English languages.
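One way such priming can be quantified in a causal language model is to compare the probability of a target sentence after a structurally matching versus mismatching prime. The sketch below does this for an English dative alternation with GPT-2; the model, sentences, and scoring choice are illustrative assumptions, not the paper's exact setup (which uses multilingual models and crosslingual prime-target pairs).

```python
# Illustrative sketch of measuring structural priming in a causal LM: score a
# target sentence after a structurally matching vs. mismatching prime.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2").eval()

def target_logprob(prime: str, target: str) -> float:
    """Sum of log-probabilities of the target tokens conditioned on the prime."""
    prime_ids = tok(prime, return_tensors="pt").input_ids
    target_ids = tok(" " + target, return_tensors="pt").input_ids
    ids = torch.cat([prime_ids, target_ids], dim=1)
    with torch.no_grad():
        logits = model(ids).logits.log_softmax(-1)
    # Score only the target positions (each predicted from the preceding token).
    scores = logits[0, prime_ids.shape[1] - 1 : -1, :]
    return scores.gather(1, target_ids[0].unsqueeze(1)).sum().item()

po_prime = "The boy gave a book to the girl."       # prepositional-object prime
do_prime = "The boy gave the girl a book."          # double-object prime
po_target = "The chef handed a plate to the waiter."

# A positive difference indicates priming toward the prepositional-object structure.
print(target_logprob(po_prime, po_target) - target_logprob(do_prime, po_target))
```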
no code implementations • 11 Oct 2023 • Catherine Arnett, Tyler A. Chang, James A. Michaelov, Benjamin K. Bergen
Do multilingual language models share abstract grammatical representations across languages, and if so, when do these develop?
1 code implementation • 29 Aug 2023 • Tyler A. Chang, Zhuowen Tu, Benjamin K. Bergen
We quantify the final surprisal, within-run variability, age of acquisition, forgettability, and cross-run variability of learning curves for individual tokens in context.
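A rough sketch of how such statistics could be computed from a matrix of surprisals (runs × checkpoints) for a single token in context is given below; the definitions here, e.g. of age of acquisition and forgettability, are simplified stand-ins rather than the paper's exact formulations.

```python
# Rough sketch of learning-curve statistics for one token in context, given a
# surprisal matrix of shape (n_runs, n_checkpoints). Definitions are approximate.
import numpy as np

def curve_stats(surprisal, checkpoint_steps, threshold=None):
    surprisal = np.asarray(surprisal, dtype=float)   # (n_runs, n_checkpoints)
    mean_curve = surprisal.mean(axis=0)
    final = mean_curve[-1]                           # final surprisal
    if threshold is None:
        # e.g. halfway between initial and final mean surprisal
        threshold = (mean_curve[0] + final) / 2.0
    below = np.where(mean_curve <= threshold)[0]
    aoa = checkpoint_steps[below[0]] if below.size else None  # age of acquisition
    forgettability = final - mean_curve.min()        # rise after the curve's minimum
    within_run = np.abs(np.diff(surprisal, axis=1)).mean()    # step-to-step fluctuation
    cross_run = surprisal.std(axis=0).mean()         # disagreement across runs
    return dict(final_surprisal=final, age_of_acquisition=aoa,
                forgettability=forgettability,
                within_run_variability=within_run,
                cross_run_variability=cross_run)

steps = np.array([1_000, 10_000, 100_000, 1_000_000])
print(curve_stats([[9.1, 6.0, 4.2, 4.0],
                   [9.3, 6.4, 4.5, 4.1]], steps))
```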
1 code implementation • 26 May 2023 • Tyler A. Chang, Kishaloy Halder, Neha Anna John, Yogarshi Vyas, Yassine Benajiba, Miguel Ballesteros, Dan Roth
In this paper, we propose three dimensions of linguistic dataset drift: vocabulary, structural, and semantic drift.
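The toy proxies below illustrate two of these dimensions (vocabulary and semantic drift); the paper defines its own metrics, and the sentence-transformers model named here is an assumption.

```python
# Toy proxies for dataset drift between a source and a target domain; treat these
# as illustrative stand-ins for the paper's metrics.
def vocabulary_drift(source_texts, target_texts):
    """Fraction of target-domain word tokens whose type never appears in the source domain."""
    source_vocab = {w for t in source_texts for w in t.lower().split()}
    target_tokens = [w for t in target_texts for w in t.lower().split()]
    unseen = sum(1 for w in target_tokens if w not in source_vocab)
    return unseen / max(len(target_tokens), 1)

def semantic_drift(source_texts, target_texts):
    """Cosine distance between mean sentence embeddings of the two domains."""
    import numpy as np
    from sentence_transformers import SentenceTransformer  # model choice is an assumption
    model = SentenceTransformer("all-MiniLM-L6-v2")
    a = model.encode(source_texts).mean(axis=0)
    b = model.encode(target_texts).mean(axis=0)
    return 1 - float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

print(vocabulary_drift(["the movie was great"], ["the covid vaccine rollout"]))
```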
1 code implementation • 20 Mar 2023 • Tyler A. Chang, Benjamin K. Bergen
Transformer language models have received widespread public attention, yet their generated text is often surprising even to NLP researchers.
1 code implementation • 22 May 2022 • Tyler A. Chang, Zhuowen Tu, Benjamin K. Bergen
Language subspace means differ along language-sensitive axes that are relatively stable throughout middle layers, and these axes encode information such as token vocabularies.
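The numpy sketch below illustrates the underlying idea with placeholder embeddings: per-language means in a shared representation space define a language-sensitive axis along which the two languages separate; a real analysis would use hidden states from a middle layer of a multilingual model.

```python
# Minimal numpy sketch: per-language subspace means and the axis along which they
# differ. The embeddings here are random placeholders standing in for hidden states.
import numpy as np

rng = np.random.default_rng(0)
emb_en = rng.normal(loc=0.0, scale=1.0, size=(500, 768))   # "English" token states
emb_fr = rng.normal(loc=0.3, scale=1.0, size=(500, 768))   # "French" token states

mean_en, mean_fr = emb_en.mean(axis=0), emb_fr.mean(axis=0)
axis = mean_fr - mean_en                    # a language-sensitive axis
axis /= np.linalg.norm(axis)

# Projections onto the axis separate the two languages on average.
print((emb_en @ axis).mean(), (emb_fr @ axis).mean())
```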
1 code implementation • 5 Oct 2021 • Tyler A. Chang, Benjamin K. Bergen
We investigate how neural language models acquire individual words during training, extracting learning curves and ages of acquisition for over 600 words on the MacArthur-Bates Communicative Development Inventory (Fenson et al., 2007).
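A minimal sketch of one way to extract an age of acquisition from a word's learning curve, by fitting a sigmoid to mean surprisal over log training steps and taking the midpoint, is shown below; the toy curve and fitting details are illustrative and may differ from the paper's procedure.

```python
# Sketch of extracting an age of acquisition from a word's learning curve:
# fit a sigmoid to mean surprisal over log training steps and take the step
# at which the curve is halfway between its upper and lower plateaus.
import numpy as np
from scipy.optimize import curve_fit

def sigmoid(x, lower, upper, midpoint, slope):
    return lower + (upper - lower) / (1.0 + np.exp(slope * (x - midpoint)))

steps = np.array([1e2, 1e3, 1e4, 1e5, 1e6])
surprisal = np.array([14.8, 14.0, 9.5, 6.2, 6.0])   # toy curve for one word
log_steps = np.log10(steps)

params, _ = curve_fit(sigmoid, log_steps, surprisal,
                      p0=[surprisal.min(), surprisal.max(), log_steps.mean(), 1.0],
                      maxfev=10_000)
age_of_acquisition = 10 ** params[2]                 # midpoint, back on the step scale
print(f"estimated age of acquisition: step {age_of_acquisition:,.0f}")
```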
1 code implementation • ACL 2021 • Tyler A. Chang, Yifan Xu, Weijian Xu, Zhuowen Tu
In this paper, we detail the relationship between convolutions and self-attention in natural language tasks.
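One point of contact between the two operations can be shown directly: an attention head whose weights depend only on relative position (not content) acts as a fixed convolution over the value-projected sequence. The toy numpy demonstration below illustrates this equivalence; it is an illustration of the general relationship, not the paper's full analysis.

```python
# Toy demonstration: position-only attention weights reduce to a 1D convolution.
import numpy as np

seq_len, dim = 10, 8
rng = np.random.default_rng(0)
values = rng.normal(size=(seq_len, dim))  # value-projected token representations
kernel = np.array([0.2, 0.5, 0.3])        # "attention" weight per relative position

# 1) As attention: weight from position i to j depends only on j - i.
attn = np.zeros((seq_len, seq_len))
for i in range(seq_len):
    for k, offset in enumerate((-1, 0, 1)):
        j = i + offset
        if 0 <= j < seq_len:
            attn[i, j] = kernel[k]
attention_out = attn @ values

# 2) As convolution: the same weights applied as a 1D kernel per feature dimension.
conv_out = np.zeros_like(values)
for k, offset in enumerate((-1, 0, 1)):
    conv_out += kernel[k] * np.roll(values, -offset, axis=0)

# Ignore the edge positions that np.roll wraps around.
print(np.allclose(attention_out[1:-1], conv_out[1:-1]))   # True
```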
no code implementations • WS 2020 • Tyler A. Chang, Anna N. Rafferty
We train neural machine translation (NMT) models from English to six target languages, using NMT encoder representations to predict ancestor constituent labels of source language words.
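The probing setup can be sketched as a simple classifier over per-word encoder states predicting constituent labels; in the sketch below the encoder states are random placeholders and the linear probe is an assumed architecture, so it only illustrates the shape of the experiment.

```python
# Sketch of the probing setup: a classifier over NMT encoder states predicting an
# ancestor constituent label (e.g. NP, VP, S) for each source word. Real encoder
# states would come from the trained NMT models; here they are placeholders.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

n_words, dim = 2_000, 512
rng = np.random.default_rng(0)
encoder_states = rng.normal(size=(n_words, dim))            # one vector per source word
labels = rng.choice(["NP", "VP", "PP", "S"], size=n_words)  # ancestor constituent labels

X_train, X_test, y_train, y_test = train_test_split(
    encoder_states, labels, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1_000).fit(X_train, y_train)

# Accuracy above the majority-class baseline would suggest the representations
# encode constituent (syntactic) information.
print(probe.score(X_test, y_test))
```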