Search Results for author: Chaitanya Baranwal

Found 1 paper, 0 papers with code

Sequence Parallelism: Long Sequence Training from System Perspective

no code implementations • 26 May 2021 • Shenggui Li, Fuzhao Xue, Chaitanya Baranwal, Yongbin Li, Yang You

That is, with sparse attention, our sequence parallelism enables us to train transformers with infinitely long sequences.
