Search Results for author: Michael Matena

Found 5 papers, 4 papers with code

NPEFF: Non-Negative Per-Example Fisher Factorization

1 code implementation • 7 Oct 2023 • Michael Matena, Colin Raffel

Using unique properties of NPEFF's parameter-space representations, we ran extensive experiments to verify that the connections between directions in parameter space and examples recovered by NPEFF actually reflect the model's processing.
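NPEFF builds on a non-negative factorization of per-example representations. As a minimal illustrative sketch (not the paper's actual procedure), here is plain multiplicative-update NMF applied to a matrix whose rows stand in for per-example diagonal Fisher vectors; all names and the update scheme are generic assumptions, not NPEFF's implementation:

```python
import numpy as np

def nmf(V, k, iters=200, seed=0):
    # Generic Lee-Seung multiplicative-update NMF: V ≈ W @ H with W, H >= 0.
    # Rows of V could be per-example (diagonal) Fisher vectors; rows of H
    # would then be shared non-negative components in parameter space.
    rng = np.random.default_rng(seed)
    n, d = V.shape
    W = rng.random((n, k)) + 1e-3   # per-example coefficients
    H = rng.random((k, d)) + 1e-3   # shared non-negative components
    for _ in range(iters):
        # Multiplicative updates keep W and H non-negative throughout.
        H *= (W.T @ V) / (W.T @ W @ H + 1e-9)
        W *= (V @ H.T) / (W @ H @ H.T + 1e-9)
    return W, H
```

With a non-negative input matrix, the updates monotonically reduce the Frobenius reconstruction error while keeping both factors non-negative.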

A Combinatorial Perspective on the Optimization of Shallow ReLU Networks

no code implementations • 1 Oct 2022 • Michael Matena, Colin Raffel

We explore the implications of this combinatorial aspect of ReLU optimization in this work.

Merging Models with Fisher-Weighted Averaging

2 code implementations • 18 Nov 2021 • Michael Matena, Colin Raffel

Computing a simple average of the models' parameters therefore corresponds to making an isotropic Gaussian approximation to their posteriors.
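The Fisher-weighted alternative to a simple average can be sketched as an element-wise weighted mean of parameters, weighted by (diagonal) Fisher estimates; the function name and the small `eps` stabilizer are illustrative assumptions, not the paper's implementation:

```python
import numpy as np

def fisher_merge(params_list, fishers_list, eps=1e-8):
    # Element-wise Fisher-weighted average with diagonal Fisher estimates:
    #   theta* = (sum_i F_i * theta_i) / (sum_i F_i)
    # With identical (isotropic) Fishers this reduces to the simple average,
    # matching the isotropic Gaussian approximation described above.
    num = sum(f * p for p, f in zip(params_list, fishers_list))
    den = sum(fishers_list) + eps  # eps avoids division by zero
    return num / den
```

Parameters with larger Fisher values (where the posterior is sharper) pull the merged value toward their model, while a uniform Fisher recovers plain parameter averaging.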

Domain Adaptation • Transfer Learning

Do Transformer Modifications Transfer Across Implementations and Applications?

1 code implementation • EMNLP 2021 • Sharan Narang, Hyung Won Chung, Yi Tay, William Fedus, Thibault Fevry, Michael Matena, Karishma Malkan, Noah Fiedel, Noam Shazeer, Zhenzhong Lan, Yanqi Zhou, Wei Li, Nan Ding, Jake Marcus, Adam Roberts, Colin Raffel

The research community has proposed copious modifications to the Transformer architecture since it was introduced over three years ago, relatively few of which have seen widespread adoption.

Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer

51 code implementations • arXiv 2019 • Colin Raffel, Noam Shazeer, Adam Roberts, Katherine Lee, Sharan Narang, Michael Matena, Yanqi Zhou, Wei Li, Peter J. Liu

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP).

Answer Generation • Common Sense Reasoning • +11
