Search Results for author: Weilin Cai

Found 1 paper, 0 papers with code

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

no code implementations • 7 Apr 2024 • Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang

Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models.
