Search Results for author: Qingtian Feng

Found 1 paper, 0 papers with code

SwapMoE: Efficient Memory-Constrained Serving of Large Sparse MoE Models via Dynamic Expert Pruning and Swapping

no code implementations • 29 Aug 2023 • Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Linghe Kong, Yunxin Liu

The main idea of SwapMoE is to keep a small dynamic set of important experts, namely Virtual Experts, in the main memory for inference, while seamlessly maintaining how the Virtual Experts map to the actual experts.
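The idea in the summary above can be illustrated with a minimal sketch: a fixed number of in-memory slots (the "virtual experts") and a table recording which actual expert each slot currently holds. This is not the authors' implementation. The class name `VirtualExpertTable`, the accumulated-gate-score importance metric, and the fallback rule in `resolve` are all assumptions made for illustration only.

```python
from typing import Dict, List


class VirtualExpertTable:
    """Toy model: a fixed number of in-memory slots mapped to actual experts.

    Hypothetical illustration only; names and policies are assumptions,
    not taken from the SwapMoE paper.
    """

    def __init__(self, num_experts: int, num_slots: int) -> None:
        if not 0 < num_slots <= num_experts:
            raise ValueError("num_slots must be in 1..num_experts")
        self.num_experts = num_experts
        # slot index -> id of the actual expert currently resident in memory
        self.slot_to_expert: List[int] = list(range(num_slots))
        # running importance per actual expert (assumed metric: summed gate scores)
        self.importance: List[float] = [0.0] * num_experts

    def record_routing(self, gate_scores: Dict[int, float]) -> None:
        """Accumulate router gate scores as a stand-in importance signal."""
        for expert_id, score in gate_scores.items():
            self.importance[expert_id] += score

    def resolve(self, expert_id: int) -> int:
        """Return the resident expert that serves a request for expert_id."""
        if expert_id in self.slot_to_expert:
            return expert_id
        # Assumed fallback: use the most important resident expert instead.
        return max(self.slot_to_expert, key=lambda e: self.importance[e])

    def swap(self) -> List[int]:
        """Make the top-k important experts resident; return newly loaded ids."""
        k = len(self.slot_to_expert)
        ranked = sorted(range(self.num_experts),
                        key=lambda e: (-self.importance[e], e))
        wanted = ranked[:k]
        wanted_set = set(wanted)
        resident = set(self.slot_to_expert)
        incoming = [e for e in wanted if e not in resident]
        loaded = list(incoming)
        # Overwrite only the slots whose expert dropped out of the top-k.
        for slot, e in enumerate(self.slot_to_expert):
            if e not in wanted_set:
                self.slot_to_expert[slot] = incoming.pop(0)
        return loaded
```

In this sketch, inference only ever touches experts listed in `slot_to_expert`, and `swap` updates the slot-to-expert table in place, so the memory footprint stays at `num_slots` experts regardless of how many experts the model has.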

Object Detection
