Search Results for author: Qi Hou

Found 5 papers, 3 papers with code

Comet: Fine-grained Computation-communication Overlapping for Mixture-of-Experts

1 code implementation • 27 Feb 2025 • Shulai Zhang, Ningxin Zheng, Haibin Lin, Ziheng Jiang, Wenlei Bao, Chengquan Jiang, Qi Hou, Weihao Cui, Size Zheng, Li-Wen Chang, Quan Chen, Xin Liu

The inter-device communication of an MoE layer can occupy 47% of the entire model execution time with popular models and frameworks.

Computational Efficiency

FLUX: Fast Software-based Communication Overlap On GPUs Through Kernel Fusion

1 code implementation • 11 Jun 2024 • Li-Wen Chang, Wenlei Bao, Qi Hou, Chengquan Jiang, Ningxin Zheng, Yinmin Zhong, Xuanrun Zhang, Zuquan Song, Chengji Yao, Ziheng Jiang, Haibin Lin, Xin Jin, Xin Liu

Overall, it can achieve up to 1.24x speedups for training over Megatron-LM on a cluster of 128 GPUs with various GPU generations and interconnects, and up to 1.66x and 1.30x speedups for prefill and decoding inference over vLLM on a cluster with 8 GPUs with various GPU generations and interconnects.

The Automatic Identification of Butterfly Species

no code implementations • 18 Mar 2018 • Juanying Xie, Qi Hou, Yinghuan Shi, Lv Peng, Liping Jing, Fuzhen Zhuang, Junping Zhang, Xiaoyang Tang, Shengquan Xu

We remove species that have only one living-environment image from the data set, then partition the remaining living-environment images into two subsets: one used as the test subset, and the other used as the training subset, combined either with all standard-pattern butterfly images or with only the standard-pattern images of the same species as the living-environment images.
