Search Results for author: Sun Ao

Found 2 papers, 2 papers with code

BurstAttention: An Efficient Distributed Attention Framework for Extremely Long Sequences

1 code implementation • 14 Mar 2024 • Sun Ao, Weilin Zhao, Xu Han, Cheng Yang, Zhiyuan Liu, Chuan Shi, Maosong Sun, Shengnan Wang, Teng Su

Effective attention modules have played a crucial role in the success of Transformer-based large language models (LLMs), but the quadratic time and memory complexities of these attention modules also pose a challenge when processing long sequences.
