Search Results for author: Jesper Anderson

Found 2 papers, 0 papers with code

Uncovering Layer-Dependent Activation Sparsity Patterns in ReLU Transformers

no code implementations • 10 Jul 2024 • Cody Wild, Jesper Anderson

Previous work has demonstrated that MLPs within ReLU Transformers exhibit high levels of sparsity, with many of their activations equal to zero for any given token.
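The snippet below is a minimal sketch, not the paper's code, of how per-token activation sparsity in a ReLU MLP block could be measured: the fraction of post-ReLU hidden units that are exactly zero for each token. The layer sizes, the toy MLP, and the dummy input are all illustrative assumptions.

```python
import torch
import torch.nn as nn

d_model, d_ff = 512, 2048          # assumed transformer MLP dimensions
mlp = nn.Sequential(
    nn.Linear(d_model, d_ff),
    nn.ReLU(),
    nn.Linear(d_ff, d_model),
)

x = torch.randn(4, 16, d_model)    # (batch, tokens, d_model) dummy input

# Capture the post-ReLU activations with a forward hook.
acts = {}
def save_relu_output(module, inputs, output):
    acts["relu"] = output.detach()

mlp[1].register_forward_hook(save_relu_output)
_ = mlp(x)

# Per-token sparsity: fraction of the d_ff hidden units equal to zero.
sparsity = (acts["relu"] == 0).float().mean(dim=-1)   # shape: (batch, tokens)
print(f"mean per-token sparsity: {sparsity.mean():.3f}")
```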