no code implementations • 18 Oct 2023 • Yanming Kang, Giang Tran, Hans De Sterck
The overall complexity of Fast Multipole Attention is $\mathcal{O}(n)$ when the queries are down-sampled and $\mathcal{O}(n \log n)$ when they are not.
Language Modelling