no code implementations • 28 Feb 2022 • Zhaodong Chen, Yuying Quan, Zheng Qu, Liu Liu, Yufei Ding, Yuan Xie
We evaluate the 1:2 and 2:4 sparsity under different configurations and achieve 1.27x~1.89x speedups over the full-attention mechanism.
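The 2:4 pattern mentioned above refers to fine-grained structured sparsity: within every contiguous group of four values, only the two largest-magnitude entries are kept. A minimal NumPy sketch of such a pruning step (illustrative only; `prune_2_4` is a hypothetical helper, not the paper's implementation):

```python
import numpy as np

def prune_2_4(x):
    """Apply 2:4 structured sparsity: in every contiguous group of 4
    values, keep the 2 with the largest magnitude and zero the rest."""
    groups = x.reshape(-1, 4)
    # Indices of the 2 smallest-magnitude entries in each group of 4
    drop = np.argsort(np.abs(groups), axis=1)[:, :2]
    mask = np.ones_like(groups, dtype=bool)
    np.put_along_axis(mask, drop, False, axis=1)
    return (groups * mask).reshape(x.shape)

scores = np.array([0.9, -0.1, 0.4, 0.05, 0.2, 0.8, -0.7, 0.3])
print(prune_2_4(scores))
# -> [ 0.9  0.   0.4  0.   0.   0.8 -0.7  0. ]
```

Hardware such as NVIDIA's Ampere tensor cores can exploit exactly this 2:4 pattern, which is where the reported speedups over dense full attention come from.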
no code implementations • 29 Sep 2021 • Zhaodong Chen, Liu Liu, Yuying Quan, Zheng Qu, Yufei Ding, Yuan Xie
Transformers are becoming mainstream solutions for various tasks in NLP and computer vision.