1 code implementation • 13 Jul 2024 • Sukjun Hwang, Aakash Lahoti, Tri Dao, Albert Gu
We identify a key axis of matrix parameterizations termed sequence alignment, which increases the flexibility and performance of matrix mixers, providing insights into the strong performance of Transformers and recent SSMs such as Mamba.
no code implementations • 23 Mar 2024 • Aakash Lahoti, Stefani Karp, Ezra Winston, Aarti Singh, Yuanzhi Li
Vision tasks are characterized by the properties of locality and translation invariance.
1 code implementation • 26 May 2023 • Aakash Lahoti, Spandan Senapati, Ketan Rajawat, Alec Koppel
Specifically, quasi-Newton methods exhibit a superlinear convergence rate at $O(d^2)$ per-iteration cost, in contrast to the linear rate of first-order methods at $O(d)$ cost and the quadratic rate of second-order methods at $O(d^3)$ cost.
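To illustrate where an $O(d^2)$ per-iteration cost can come from, here is a minimal sketch of the classical BFGS inverse-Hessian update, a standard quasi-Newton step. It is not the method proposed in the paper above; the function name `bfgs_inverse_update` and the sample vectors are hypothetical, chosen only for illustration. The update uses one matrix-vector product and rank-one corrections, so it never forms or inverts a full Hessian, which is the step that costs $O(d^3)$ in second-order methods.

```python
def bfgs_inverse_update(H, s, y):
    """Return the BFGS update of a symmetric inverse-Hessian estimate H.

    s = x_new - x_old (step), y = grad_new - grad_old (gradient change).
    All work is O(d^2): one matrix-vector product plus rank-one terms.
    """
    d = len(s)
    sy = sum(si * yi for si, yi in zip(s, y))
    if sy <= 0:
        # BFGS requires positive curvature along the step.
        raise ValueError("curvature condition s^T y > 0 violated")

    # Matrix-vector product H @ y: the only O(d^2) product needed.
    Hy = [sum(H[i][j] * y[j] for j in range(d)) for i in range(d)]
    yHy = sum(yi * hyi for yi, hyi in zip(y, Hy))
    coeff = (1.0 + yHy / sy) / sy

    # H_new = H - (Hy s^T + s (Hy)^T) / (s^T y) + coeff * s s^T
    return [
        [
            H[i][j] - (Hy[i] * s[j] + s[i] * Hy[j]) / sy + coeff * s[i] * s[j]
            for j in range(d)
        ]
        for i in range(d)
    ]


# Hypothetical example data: start from the identity as the initial estimate.
H0 = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
s = [1.0, 0.5, -0.25]
y = [2.0, 1.0, 0.5]
H1 = bfgs_inverse_update(H0, s, y)
```

The updated estimate satisfies the secant condition `H1 @ y == s` up to floating-point error and stays symmetric, which is what lets quasi-Newton methods approximate curvature from gradient differences alone.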