Search Results for author: Michael Matena

Found 5 papers, 4 papers with code

NPEFF: Non-Negative Per-Example Fisher Factorization

1 code implementation • 7 Oct 2023 • Michael Matena, Colin Raffel

Using unique properties of NPEFF's parameter-space representations, we ran extensive experiments to verify that the connections between directions in parameter space and examples recovered by NPEFF actually reflect the model's processing.
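NPEFF builds on a non-negative factorization of per-example representations. As a minimal illustrative sketch (not the paper's actual procedure), here is plain multiplicative-update NMF applied to a matrix whose rows stand in for per-example diagonal Fisher vectors; all names and the update scheme are generic assumptions, not NPEFF's implementation:

```python
import numpy as np

def nmf(V, k, iters=200, seed=0):
    # Generic Lee-Seung multiplicative-update NMF: V ≈ W @ H with W, H >= 0.
    # Rows of V could be per-example (diagonal) Fisher vectors; rows of H
    # would then be shared non-negative components in parameter space.
    rng = np.random.default_rng(seed)
    n, d = V.shape
    W = rng.random((n, k)) + 1e-3   # per-example coefficients
    H = rng.random((k, d)) + 1e-3   # shared non-negative components
    for _ in range(iters):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H
```

With a non-negative input matrix, the updates monotonically reduce the Frobenius reconstruction error while keeping both factors non-negative.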

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

no code implementations • 1 Oct 2022 • Michael Matena, Colin Raffel

We explore the implications of this combinatorial aspect of ReLU optimization in this work.

Merging Models with Fisher-Weighted Averaging

2 code implementations • 18 Nov 2021 • Michael Matena, Colin Raffel

Computing a simple average of the models' parameters therefore corresponds to making an isotropic Gaussian approximation to their posteriors.
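The Fisher-weighted alternative to a simple average can be sketched as an element-wise weighted mean of parameters, weighted by (diagonal) Fisher estimates; the function name and the small `eps` stabilizer are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fisher_merge(params_list, fishers_list, eps=1e-8):
    # Element-wise Fisher-weighted average with diagonal Fisher estimates:
    #   theta* = (sum_i F_i * theta_i) / (sum_i F_i)
    # With identical (isotropic) Fishers this reduces to the simple average,
    # matching the isotropic Gaussian approximation described above.
    num = sum(f * p for p, f in zip(params_list, fishers_list))
    den = sum(fishers_list) + eps  # eps avoids division by zero
    return num / den
```

Parameters with larger Fisher values (where the posterior is sharper) pull the merged value toward their model, while a uniform Fisher recovers plain parameter averaging.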

Domain Adaptation • Transfer Learning

Do Transformer Modifications Transfer Across Implementations and Applications?

1 code implementation • EMNLP 2021 • Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel

The research community has proposed copious modifications to the Transformer architecture since it was introduced over three years ago, relatively few of which have seen widespread adoption.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

51 code implementations • arXiv 2019 • Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

Answer Generation • Common Sense Reasoning • +11
