Search Results for author: Le Qin

Found 1 paper, 0 papers with code

Shortcut-connected Expert Parallelism for Accelerating Mixture-of-Experts

no code implementations • 7 Apr 2024 • Weilin Cai, Juyong Jiang, Le Qin, Junwei Cui, Sunghun Kim, Jiayi Huang

Expert parallelism has been introduced as a strategy to distribute the computational workload of sparsely-gated mixture-of-experts (MoE) models across multiple computing devices, facilitating the execution of these increasingly large-scale models.
