Search Results for author: Kaj Bostrom

Found 7 papers, 5 papers with code

Lil-Bevo: Explorations of Strategies for Training Language Models in More Humanlike Ways

1 code implementation • 26 Oct 2023 • Venkata S Govindarajan, Juan Diego Rodriguez, Kaj Bostrom, Kyle Mahowald

We pretrained our masked language models with three ingredients: an initial pretraining with music data, training on shorter sequences before training on longer ones, and masking specific tokens to target some of the BLiMP subtasks.

Language Modelling • Masked Language Modeling
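
The second ingredient, training on shorter sequences before longer ones, is essentially a length curriculum. A minimal sketch; the stage lengths and toy corpus are illustrative, not the released Lil-Bevo recipe:

```python
# A minimal sketch of a short-before-long length curriculum; the stage
# lengths and toy data are illustrative, not the released Lil-Bevo recipe.
def length_curriculum(token_seqs, stages=(64, 128, 512)):
    """Yield (max_len, truncated sequences) for each curriculum stage."""
    for max_len in stages:
        yield max_len, [seq[:max_len] for seq in token_seqs]

# Toy stand-in for tokenized pretraining text.
corpus = [["tok"] * 600, ["tok"] * 300]

for max_len, batch in length_curriculum(corpus):
    # In a real run, a masked-LM training loop would consume `batch` here.
    print(f"stage max_len={max_len}: sequence lengths {[len(s) for s in batch]}")
```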

MuSR: Testing the Limits of Chain-of-thought with Multistep Soft Reasoning

1 code implementation • 24 Oct 2023 • Zayne Sprague, Xi Ye, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

We evaluate a range of LLMs and prompting techniques on this dataset and characterize the gaps that remain for techniques like chain-of-thought to perform robust reasoning.
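
As a rough illustration of the chain-of-thought prompting evaluated on this dataset, a minimal sketch; the narrative, question, and `call_llm` wrapper are hypothetical stand-ins, not MuSR itself:

```python
# A minimal chain-of-thought prompting sketch; `call_llm` is a hypothetical
# wrapper around whichever model is being evaluated, and the story/question
# are illustrative stand-ins for a MuSR instance.
def build_cot_prompt(narrative: str, question: str) -> str:
    return (
        f"{narrative}\n\n"
        f"Question: {question}\n"
        "Let's think step by step, citing facts from the story, "
        "before giving a final answer."
    )

prompt = build_cot_prompt(
    narrative="Alice left the party at 9pm. The window was broken at 10pm.",
    question="Who is the most likely suspect?",
)
# answer = call_llm(prompt)  # hypothetical; any chat/completions API fits here
print(prompt)
```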

Deductive Additivity for Planning of Natural Language Proofs

1 code implementation • 5 Jul 2023 • Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

Specifically, we evaluate whether embedding spaces exhibit a property we call deductive additivity: the sum of premise statement embeddings should be close to embeddings of conclusions based on those premises.

Language Modelling • Large Language Model • +1
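
The property is straightforward to probe directly. A minimal sketch, assuming the sentence-transformers library; the encoder choice and example statements are illustrative, not the paper's setup:

```python
# A minimal sketch probing deductive additivity: is the sum of premise
# embeddings close to the conclusion embedding? Model and examples are
# illustrative assumptions, not the paper's configuration.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("all-MiniLM-L6-v2")  # illustrative encoder choice

premises = [
    "All birds can fly.",
    "A sparrow is a bird.",
]
conclusion = "A sparrow can fly."

# Embed each statement, then compare the *sum* of the premise embeddings
# to the conclusion embedding via cosine similarity.
prem_vecs = model.encode(premises)
concl_vec = model.encode(conclusion)
summed = prem_vecs.sum(axis=0)

cos = np.dot(summed, concl_vec) / (np.linalg.norm(summed) * np.linalg.norm(concl_vec))
print(f"cosine(premise sum, conclusion) = {cos:.3f}")
```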

Natural Language Deduction with Incomplete Information

2 code implementations • 1 Nov 2022 • Zayne Sprague, Kaj Bostrom, Swarat Chaudhuri, Greg Durrett

A growing body of work studies how to answer a question or verify a claim by generating a natural language "proof": a chain of deductive inferences yielding the answer based on a set of premises.

Text Generation
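
One way to picture such a proof is as a chain of steps, each deducing a new statement from earlier ones. A minimal sketch; the data structure and example deduction are illustrative, not the paper's system:

```python
# A minimal sketch of a natural language "proof" as a chain of deductive
# steps; the dataclass and example are illustrative, not the paper's system.
from dataclasses import dataclass

@dataclass
class Step:
    premises: list[str]   # statements this step combines
    conclusion: str       # statement deduced from them

proof = [
    Step(["All metals conduct electricity.", "Copper is a metal."],
         "Copper conducts electricity."),
    Step(["Copper conducts electricity.", "The wire is made of copper."],
         "The wire conducts electricity."),
]

# The final conclusion answers the question / verifies the claim.
print(proof[-1].conclusion)
```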

Natural Language Deduction through Search over Statement Compositions

no code implementations • 16 Jan 2022 • Kaj Bostrom, Zayne Sprague, Swarat Chaudhuri, Greg Durrett

In settings from fact-checking to question answering, we frequently want to know whether a collection of evidence (premises) entails a hypothesis.

Fact Checking • Question Answering
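
The entailment judgment at the heart of this setting can be approximated with an off-the-shelf NLI model. A minimal sketch assuming HuggingFace transformers; the model choice and example are illustrative, not the paper's search procedure:

```python
# A minimal entailment-check sketch using an off-the-shelf MNLI model;
# this approximates the premises-entail-hypothesis judgment only, not the
# paper's actual search over statement compositions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

premises = "Socrates is a man. All men are mortal."
hypothesis = "Socrates is mortal."

# MNLI-style models take a (premise, hypothesis) sentence pair.
print(nli({"text": premises, "text_pair": hypothesis}))
```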

Flexible Generation of Natural Language Deductions

1 code implementation • EMNLP 2021 • Kaj Bostrom, Xinyu Zhao, Swarat Chaudhuri, Greg Durrett

Natural language is an attractive representation for this purpose -- it is both highly expressive and easy for humans to understand.

Sentence

Byte Pair Encoding is Suboptimal for Language Model Pretraining

no code implementations • Findings of the Association for Computational Linguistics 2020 • Kaj Bostrom, Greg Durrett

We analyze differences between BPE and unigram LM tokenization, finding that the latter method recovers subword units that align more closely with morphology and avoids problems stemming from BPE's greedy construction procedure.

Language Modelling
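
The contrast is easy to reproduce with the HuggingFace tokenizers library. A minimal sketch; the toy corpus and vocabulary size are illustrative, not the paper's experimental setup:

```python
# A minimal sketch contrasting BPE and unigram LM tokenization with the
# HuggingFace `tokenizers` library; corpus and vocab size are toy choices.
from tokenizers import Tokenizer, models, trainers, pre_tokenizers

corpus = ["unhappiness unrelated unusual happiness relatedness"] * 100

def train(model, trainer_cls):
    tok = Tokenizer(model)
    tok.pre_tokenizer = pre_tokenizers.Whitespace()
    tok.train_from_iterator(corpus, trainer_cls(vocab_size=60))
    return tok

bpe = train(models.BPE(), trainers.BpeTrainer)
uni = train(models.Unigram(), trainers.UnigramTrainer)

word = "unhappiness"
print("BPE:    ", bpe.encode(word).tokens)  # greedy merges, often non-morphemic
print("Unigram:", uni.encode(word).tokens)  # tends to align better with morphemes
```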
