no code implementations • 12 Nov 2024 • Bo Chen, Xiaoyu Li, Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song
In this work, we establish a tighter circuit complexity bound for Transformers with $\mathsf{RoPE}$ attention.
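For context, $\mathsf{RoPE}$ (rotary position embedding) attention rotates query and key vectors by position-dependent angles before the attention dot product, so that $q_m \cdot k_n$ depends only on the relative offset $m - n$. Below is a minimal numpy sketch of the standard RoPE formulation (Su et al., 2021); the function names are illustrative, not taken from the paper.

```python
import numpy as np

def rope_rotate(x, positions, base=10000.0):
    """Apply rotary position embeddings (RoPE) to a batch of vectors.

    x:         (seq_len, dim) query or key vectors; dim must be even.
    positions: (seq_len,) integer token positions.
    Each consecutive feature pair is rotated by an angle that depends
    on the position and a per-pair frequency.
    """
    seq_len, dim = x.shape
    # Frequencies theta_i = base^(-2i/dim), one per feature pair.
    freqs = base ** (-np.arange(0, dim, 2) / dim)        # (dim/2,)
    angles = positions[:, None] * freqs[None, :]         # (seq_len, dim/2)
    cos, sin = np.cos(angles), np.sin(angles)
    x_even, x_odd = x[:, 0::2], x[:, 1::2]
    out = np.empty_like(x)
    out[:, 0::2] = x_even * cos - x_odd * sin
    out[:, 1::2] = x_even * sin + x_odd * cos
    return out

def rope_attention(Q, K, V):
    """Single-head attention with RoPE: rotate queries and keys,
    then take the usual scaled dot-product softmax."""
    pos = np.arange(Q.shape[0])
    Qr, Kr = rope_rotate(Q, pos), rope_rotate(K, pos)
    scores = Qr @ Kr.T / np.sqrt(Q.shape[1])
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V
```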
no code implementations • 15 Oct 2024 • Yingyu Liang, Jiangxuan Long, Zhenmei Shi, Zhao Song, Yufa Zhou
Large Language Models (LLMs) have shown immense potential in enhancing various aspects of our daily lives, from conversational AI to search and AI assistants.