Search Results for author: Xiaonan Nie

Found 2 papers, 2 papers with code

Dense-to-Sparse Gate for Mixture-of-Experts

1 code implementation • 29 Dec 2021 • Xiaonan Nie, Shijie Cao, Xupeng Miao, Lingxiao Ma, Jilong Xue, Youshan Miao, Zichao Yang, Zhi Yang, Bin Cui

However, we found that the current approach of jointly training experts and the sparse gate introduces a negative impact on model accuracy, diminishing the efficiency of expensive large-scale model training.
