Search Results for author: Davis Yoshida

Found 6 papers, 1 papers with code

MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy

no code implementations • 15 Nov 2023 • Davis Yoshida, Kartik Goyal, Kevin Gimpel

It has been widely observed that exact or approximate MAP (mode-seeking) decoding from natural language generation (NLG) models consistently leads to degenerate outputs (Stahlberg and Byrne, 2019, Holtzman et al., 2019).

Instruction Following Language Modelling +2

Paper
Add Code

NF4 Isn't Information Theoretically Optimal (and that's Good)

1 code implementation • 12 Jun 2023 • Davis Yoshida

This note shares some simple calculations and experiments related to absmax-based blockwise quantization, as used in Dettmers et al., 2023.

Quantization

Paper
Code

Reconsidering the Past: Optimizing Hidden States in Language Models

no code implementations • Findings (EMNLP) 2021 • Davis Yoshida, Kevin Gimpel

We present Hidden-State Optimization (HSO), a gradient-based method for improving the performance of transformer language models at inference time.

Language Modelling

Paper
Add Code

Adding Recurrence to Pretrained Transformers

no code implementations • 1 Jan 2021 • Davis Yoshida, Allyson Ettinger, Kevin Gimpel

Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years.

Language Modelling

Paper
Add Code

Adding Recurrence to Pretrained Transformers for Improved Efficiency and Context Size

no code implementations • 16 Aug 2020 • Davis Yoshida, Allyson Ettinger, Kevin Gimpel

Fine-tuning a pretrained transformer for a downstream task has become a standard method in NLP in the last few years.

Language Modelling

Paper
Add Code

Using Confusion Graphs to Understand Classifier Error

no code implementations • WS 2016 • Davis Yoshida, Jordan Boyd-Graber

Question Answering

Paper
Add Code

Cannot find the paper you are looking for? You can Submit a new open access paper.