9 Apr 2024 • Gonçalo Paulo, Thomas Marshall, Nora Belrose
Recent advances in recurrent neural network architectures, such as Mamba and RWKV, have enabled RNNs to match or exceed the performance of equal-size transformers in terms of language modeling perplexity and downstream evaluations, suggesting that future systems may be built on completely new architectures.
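The perplexity comparison the abstract alludes to can be reproduced in miniature with off-the-shelf checkpoints. Below is a minimal sketch, assuming Hugging Face `transformers` support for Mamba and RWKV; the model IDs (`state-spaces/mamba-130m-hf`, `RWKV/rwkv-4-169m-pile`, `EleutherAI/pythia-160m`) are illustrative roughly equal-size checkpoints, not necessarily the models the authors evaluated.

```python
# Sketch (not from the paper): compare language-modeling perplexity of
# small RNN-style and transformer checkpoints on the same text.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

MODELS = {
    "mamba (SSM/RNN)":      "state-spaces/mamba-130m-hf",  # assumed checkpoint
    "rwkv (RNN)":           "RWKV/rwkv-4-169m-pile",       # assumed checkpoint
    "pythia (transformer)": "EleutherAI/pythia-160m",      # assumed checkpoint
}
text = "Recurrent architectures such as Mamba and RWKV now rival transformers."

for name, repo in MODELS.items():
    tokenizer = AutoTokenizer.from_pretrained(repo)
    model = AutoModelForCausalLM.from_pretrained(repo)
    model.eval()
    input_ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        # Passing labels == input_ids yields the mean next-token
        # cross-entropy; perplexity is its exponential.
        loss = model(input_ids, labels=input_ids).loss
    print(f"{name:22s} perplexity = {torch.exp(loss).item():.2f}")
```

A real evaluation would average the loss over a held-out corpus rather than a single sentence, but the mechanics are the same.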