Search Results for author: Zhuoyuan Huo

TransformerFAM: Feedback attention is working memory

While Transformers have revolutionized deep learning, their quadratic attention complexity hinders their ability to process infinitely long inputs.

