1 code implementation • 26 Oct 2023 • Venkata S Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, Kyle Mahowald
We pretrained our masked language models using three ingredients: an initial pretraining phase on music data, a curriculum that trains on shorter sequences before longer ones, and targeted masking of specific tokens aimed at some of the BLiMP subtasks.
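No training code accompanies this snippet, but a minimal sketch of the targeted-masking ingredient might look like the following; the mask id, target token ids, and masking rates are hypothetical placeholders, not the authors' actual settings:

```python
import random

MASK_ID = 103              # hypothetical [MASK] token id
TARGET_IDS = {2003, 2024}  # hypothetical ids of tokens tied to BLiMP phenomena
BASE_RATE, TARGET_RATE = 0.15, 0.5  # hypothetical masking probabilities

def mask_tokens(token_ids):
    """Mask targeted tokens at a higher rate than ordinary tokens."""
    inputs, labels = [], []
    for tid in token_ids:
        rate = TARGET_RATE if tid in TARGET_IDS else BASE_RATE
        if random.random() < rate:
            inputs.append(MASK_ID)  # replace with [MASK]
            labels.append(tid)      # the MLM loss predicts the original token
        else:
            inputs.append(tid)
            labels.append(-100)     # -100 is ignored by the loss
    return inputs, labels
```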
1 code implementation • 24 Oct 2023 • Zayne Sprague, Xi Ye, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett
We evaluate a range of LLMs and prompting techniques on this dataset and characterize the gaps that remain for techniques like chain-of-thought to perform robust reasoning.
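As a rough illustration of this kind of evaluation (not the paper's actual harness), a chain-of-thought condition can differ from direct prompting by a single reasoning cue; `model_generate` below stands in for any LLM call, and the lenient string match is an assumption:

```python
COT_SUFFIX = "Let's think step by step."

def build_prompt(question, use_cot=True):
    """Assemble a prompt; chain-of-thought appends a reasoning cue."""
    prompt = f"Question: {question}\nAnswer:"
    return f"{prompt} {COT_SUFFIX}" if use_cot else prompt

def accuracy(model_generate, examples, use_cot=True):
    """Score any str -> str model on (question, gold_answer) pairs."""
    correct = 0
    for question, gold in examples:
        prediction = model_generate(build_prompt(question, use_cot))
        correct += gold.lower() in prediction.lower()  # lenient match
    return correct / len(examples)
```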
1 code implementation • 5 Jul 2023 • Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett
Specifically, we evaluate whether embedding spaces exhibit a property we call deductive additivity: the sum of premise statement embeddings should be close to embeddings of conclusions based on those premises.
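The property itself is easy to operationalize; a minimal sketch, assuming `embed` is any sentence encoder returning a 1-D NumPy vector:

```python
import numpy as np

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def deductive_additivity_score(embed, premises, conclusion):
    """Cosine between the summed premise embeddings and the conclusion.

    High scores suggest the embedding space behaves additively for
    deduction; `embed` is a placeholder for any sentence encoder.
    """
    premise_sum = np.sum([embed(p) for p in premises], axis=0)
    return cosine(premise_sum, embed(conclusion))
```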
2 code implementations • 1 Nov 2022 • Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett
A growing body of work studies how to answer a question or verify a claim by generating a natural language "proof": a chain of deductive inferences yielding the answer based on a set of premises.
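One way to make the "chain of deductive inferences" concrete is a list of steps, each pairing premises with a derived conclusion; the statements below are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class Step:
    premises: list[str]  # statements this inference draws on
    conclusion: str      # the statement it derives

# A proof is an ordered chain whose final conclusion answers the question.
proof = [
    Step(["All birds have wings.", "A sparrow is a bird."],
         "A sparrow has wings."),
    Step(["A sparrow has wings.", "Winged animals can fly."],
         "A sparrow can fly."),
]
```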
no code implementations • 16 Jan 2022 • Kaj Bostrom, Zayne Sprague, Swarat Chaudhuri, Greg Durrett
In settings from fact-checking to question answering, we frequently want to know whether a collection of evidence (premises) entails a hypothesis.
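A bare-bones version of that entailment check can use an off-the-shelf NLI model; the checkpoint name below is one common choice, not necessarily what the paper used:

```python
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

name = "roberta-large-mnli"  # any NLI checkpoint works here
tokenizer = AutoTokenizer.from_pretrained(name)
model = AutoModelForSequenceClassification.from_pretrained(name)

def entails(premises, hypothesis):
    """Concatenate the premises and ask the NLI model about the hypothesis."""
    inputs = tokenizer(" ".join(premises), hypothesis, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    return model.config.id2label[int(logits.argmax())] == "ENTAILMENT"
```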
1 code implementation • EMNLP 2021 • Kaj Bostrom, Xinyu Zhao, Swarat Chaudhuri, Greg Durrett
Natural language is an attractive representation for expressing a system's reasoning -- it is both highly expressive and easy for humans to understand.
no code implementations • Findings of the Association for Computational Linguistics: EMNLP 2020 • Kaj Bostrom, Greg Durrett
We analyze differences between BPE and unigram LM tokenization, finding that the latter method recovers subword units that align more closely with morphology and avoids problems stemming from BPE's greedy construction procedure.
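To reproduce that kind of comparison in miniature, SentencePiece can train both tokenizer types on the same corpus; `corpus.txt` and the vocabulary size are placeholders, not the paper's setup:

```python
import sentencepiece as spm

# Train a BPE and a unigram LM tokenizer on the same (placeholder) corpus.
for model_type in ("bpe", "unigram"):
    spm.SentencePieceTrainer.train(
        input="corpus.txt", model_prefix=model_type,
        vocab_size=8000, model_type=model_type,
    )

bpe = spm.SentencePieceProcessor(model_file="bpe.model")
uni = spm.SentencePieceProcessor(model_file="unigram.model")

# Inspect how each scheme segments morphologically complex words.
for word in ["unhappiness", "preprocessing"]:
    print(word, bpe.encode(word, out_type=str), uni.encode(word, out_type=str))
```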