Search Results for author: Wencong Xiao

Found 5 papers, 3 papers with code

Boosting Large-scale Parallel Training Efficiency with C4: A Communication-Driven Approach

no code implementations • 7 Jun 2024 • Jianbo Dong, Bin Luo, Jun Zhang, Pengcheng Zhang, Fei Feng, Yikai Zhu, Ang Liu, Zian Chen, Yi Shi, Hairong Jiao, Gang Lu, Yu Guan, Ennan Zhai, Wencong Xiao, Hanyu Zhao, Man Yuan, Siran Yang, Xiang Li, Jiamang Wang, Rui Men, Jianwei Zhang, Huang Zhong, Dennis Cai, Yuan Xie, Binzhang Fu

By leveraging this feature, C4 can rapidly identify the faulty components, swiftly isolate the anomaly, and restart the task, thereby avoiding resource wastage caused by delays in anomaly detection.

Anomaly Detection

FusionAD: Multi-modality Fusion for Prediction and Planning Tasks of Autonomous Driving

1 code implementation • 2 Aug 2023 • Tengju Ye, Wei Jing, Chunyong Hu, Shikun Huang, Lingping Gao, Fangzhen Li, Jingke Wang, Ke Guo, Wencong Xiao, Weibo Mao, Hang Zheng, Kun Li, Junbo Chen, Kaicheng Yu

Building a multi-modality multi-task neural network toward accurate and robust performance is a de-facto standard in perception tasks of autonomous driving.

Autonomous Driving

MIGPerf: A Comprehensive Benchmark for Deep Learning Training and Inference Workloads on Multi-Instance GPUs

1 code implementation • 1 Jan 2023 • Huaizheng Zhang, Yuanming Li, Wencong Xiao, Yizheng Huang, Xing Di, Jianxiong Yin, Simon See, Yong Luo, Chiew Tong Lau, Yang You

The vision of this paper is to provide a more comprehensive and practical benchmark study for MIG in order to eliminate the need for tedious manual benchmarking and tuning efforts.


Analysis of Large-Scale Multi-Tenant GPU Clusters for DNN Training Workloads

1 code implementation • 17 Jan 2019 • Myeongjae Jeon, Shivaram Venkataraman, Amar Phanishayee, Junjie Qian, Wencong Xiao, Fan Yang

With widespread advances in machine learning, a number of large enterprises are beginning to incorporate machine learning models across a number of products.

Distributed, Parallel, and Cluster Computing

Balanced Sparsity for Efficient DNN Inference on GPU

no code implementations • 1 Nov 2018 • Zhuliang Yao, Shijie Cao, Wencong Xiao, Chen Zhang, Lanshun Nie

However, it requires the customization of hardware to speed up practical inference.
