1 code implementation • 16 Dec 2023 • Yuntao Gui, Xiao Yan, Peiqi Yin, Han Yang, James Cheng
We design the sparse multi-head attention (MHA) module, which computes and stores only large attention weights to reduce memory consumption, and the routed feed-forward network (FFN) module, which dynamically activates a subset of model parameters for each token to reduce computation cost.
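The two ideas in this excerpt can be illustrated with a small sketch. The excerpt does not say how "large" attention weights are selected or how tokens are routed, so the code below assumes top-k selection per query and top-1 routing over ReLU experts. All function names are hypothetical and are not taken from the paper's implementation. This toy version also scores every key before discarding the small ones, so it illustrates only the storage saving, not an efficient kernel.

```python
import heapq
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(x, w):
    """Row vector x (length n) times matrix w (n rows, m columns)."""
    return [dot(x, col) for col in zip(*w)]


def topk_sparse_attention(queries, keys, values, keep):
    """Single-head attention that stores only `keep` weights per query.

    Assumption: "large" weights are chosen by top-k on the scaled scores.
    """
    scale = 1.0 / math.sqrt(len(keys[0]))
    dim = len(values[0])
    outputs, kept = [], []
    for q in queries:
        scores = [dot(q, k) * scale for k in keys]
        # Indices of the `keep` largest scores; other weights are treated as zero.
        top = heapq.nlargest(keep, range(len(keys)), key=scores.__getitem__)
        # Softmax restricted to the kept scores (shifted for numerical stability).
        peak = max(scores[j] for j in top)
        exps = [math.exp(scores[j] - peak) for j in top]
        total = sum(exps)
        weights = [e / total for e in exps]
        outputs.append(
            [sum(w * values[j][d] for w, j in zip(weights, top)) for d in range(dim)]
        )
        kept.append(dict(zip(top, weights)))  # sparse row: key index -> weight
    return outputs, kept


def routed_ffn(tokens, router, experts):
    """Top-1 routed FFN: each token runs through a single expert.

    Assumption: router is a d x E matrix, and each expert is a (w1, w2)
    pair with w1 of shape d x h and w2 of shape h x d.
    """
    outputs, choices = [], []
    for x in tokens:
        logits = matvec(x, router)
        e = max(range(len(experts)), key=logits.__getitem__)
        w1, w2 = experts[e]
        hidden = [max(0.0, h) for h in matvec(x, w1)]  # ReLU
        outputs.append(matvec(hidden, w2))
        choices.append(e)
    return outputs, choices
```

With `keep` equal to the number of keys, `topk_sparse_attention` reduces to ordinary dense softmax attention. With a smaller `keep`, each query stores `keep` weights instead of one per key. In `routed_ffn`, only the selected expert's parameters are touched for a given token.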
no code implementations • Proceedings of the 2021 International Conference on Management of Data 2021 • Yidi Wu, Yuntao Gui, Tatiana Jin, James Cheng, Xiao Yan, Peiqi Yin, Yufei Cai, Bo Tang, Fan Yu
Graph neural networks (GNNs) have achieved remarkable performance in many graph analytics tasks such as node classification, link prediction and graph clustering.