Search Results for author: Yongqiang Xiong

Found 5 papers, 1 paper with code

An Adaptive Deep RL Method for Non-Stationary Environments with Piecewise Stable Context

no code implementations24 Dec 2022 Xiaoyu Chen, Xiangming Zhu, Yufeng Zheng, Pushi Zhang, Li Zhao, Wenxue Cheng, Peng Cheng, Yongqiang Xiong, Tao Qin, Jianyu Chen, Tie-Yan Liu

One of the key challenges in deploying RL to real-world applications is adapting to variations in unknown environment contexts, such as changing terrains in robotic tasks and fluctuating bandwidth in congestion control.

Tutel: Adaptive Mixture-of-Experts at Scale

2 code implementations7 Jun 2022 Changho Hwang, Wei Cui, Yifan Xiong, Ziyue Yang, Ze Liu, Han Hu, Zilong Wang, Rafael Salas, Jithin Jose, Prabhat Ram, Joe Chau, Peng Cheng, Fan Yang, Mao Yang, Yongqiang Xiong

On efficiency, Flex accelerates SwinV2-MoE, achieving up to 1.55x and 2.11x speedup in training and inference over Fairseq, respectively.
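As context for the Mixture-of-Experts layers this paper scales, the core routing step is top-k expert gating: each token's router logits pick k experts and softmax-normalized weights. A minimal NumPy sketch (a simplification; Tutel's actual adaptive dispatch is far more involved):

```python
import numpy as np

def topk_gate(logits, k=2):
    """Pick the k highest-scoring experts for one token and
    softmax-normalize their weights over just those k logits."""
    idx = np.argsort(logits)[-k:][::-1]          # top-k expert ids, best first
    w = np.exp(logits[idx] - logits[idx].max())  # stable softmax over top-k
    return idx, w / w.sum()

# Route one token among 4 experts, keeping the top 2.
idx, w = topk_gate(np.array([0.1, 2.0, -1.0, 1.0]), k=2)
```

Here experts 1 and 3 are selected, with expert 1 receiving the larger weight.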

Object Detection

CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner

no code implementations14 Mar 2021 Cheng Luo, Lei Qu, Youshan Miao, Peng Cheng, Yongqiang Xiong

Distributed deep learning workloads include throughput-intensive training tasks on GPU clusters, where Distributed Stochastic Gradient Descent (SGD) incurs significant communication delays after backward propagation, forcing workers to wait for gradient synchronization via a centralized parameter server or directly among decentralized workers.
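The synchronization barrier described above can be sketched in a few lines: in synchronous parameter-server SGD, the update cannot proceed until every worker's gradient has arrived, so the step runs at the pace of the slowest worker. A minimal sketch (function and variable names are illustrative, not from the paper):

```python
import numpy as np

def parameter_server_step(params, worker_grads, lr=0.1):
    """One synchronous SGD step: the server averages gradients from
    all workers before updating, so each worker effectively blocks
    until the slowest one has sent its gradient."""
    avg_grad = np.mean(worker_grads, axis=0)
    return params - lr * avg_grad

params = np.array([1.0, -2.0])
grads = [np.array([0.2, -0.4]),   # worker 0
         np.array([0.4, -0.8])]   # worker 1
params = parameter_server_step(params, grads)
```

Overlapping this wait with another application's compute is the gap CrossoverScheduler targets.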

Image Classification

Simulating Performance of ML Systems with Offline Profiling

no code implementations17 Feb 2020 Hongming Huang, Peng Cheng, Hong Xu, Yongqiang Xiong

We advocate that simulation based on offline profiling is a promising approach to better understand and improve the complex ML systems.
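The simplest form of simulation from offline profiling is to estimate an iteration's runtime by replaying profiled per-operator latencies over the execution schedule. A hypothetical sketch under that assumption (names and the purely additive cost model are illustrative):

```python
def simulate_iteration_time(op_profile, schedule):
    """Estimate one training iteration's runtime by summing
    offline-profiled per-op latencies (ms) along the schedule.
    Ignores overlap between compute and communication."""
    return sum(op_profile[op] for op in schedule)

# Profiled latencies in milliseconds (made-up numbers).
profile = {"fwd": 3.0, "bwd": 6.0, "allreduce": 2.0}
t = simulate_iteration_time(profile, ["fwd", "bwd", "allreduce"])
```

A real simulator would also model overlap and contention, but even this additive baseline shows how profiling decouples measurement from the live cluster.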

Stanza: Layer Separation for Distributed Training in Deep Learning

no code implementations27 Dec 2018 Xiaorui Wu, Hong Xu, Bo Li, Yongqiang Xiong

Thus, we propose layer separation in distributed training: the majority of the nodes just train the convolutional layers, and the rest train the fully connected layers only.
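The layer-separation idea above amounts to partitioning the worker pool by layer type rather than replicating the whole model everywhere. A hypothetical sketch of that node assignment (the function name and the 3:1 split are assumptions, not from the paper):

```python
def assign_layers(num_nodes, conv_fraction=0.75):
    """Split nodes Stanza-style: most nodes train only the
    convolutional layers, the remainder train only the fully
    connected layers. Returns node ids per layer group."""
    n_conv = max(1, int(num_nodes * conv_fraction))
    return {"conv": list(range(n_conv)),
            "fc": list(range(n_conv, num_nodes))}

groups = assign_layers(8)
```

With 8 nodes, 6 train the convolutional layers and 2 train the fully connected layers, so the large FC gradients are exchanged among far fewer nodes.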
