Search Results for author: Youshan Miao

Found 7 papers, 2 papers with code

Dense-to-Sparse Gate for Mixture-of-Experts

1 code implementation • 29 Dec 2021 • Xiaonan Nie, Shijie Cao, Xupeng Miao, Lingxiao Ma, Jilong Xue, Youshan Miao, Zichao Yang, Zhi Yang, Bin Cui

However, we found that the current approach of jointly training experts and the sparse gate introduces a negative impact on model accuracy, diminishing the efficiency of expensive large-scale model training.
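The paper's title suggests a gate that starts dense and gradually becomes sparse. A minimal sketch of that idea, assuming a temperature-annealed softmax (the exact annealing schedule, threshold, and function names here are illustrative, not the paper's implementation):

```python
import numpy as np

def dts_gate(logits, temperature, top_k):
    """Temperature-controlled softmax gate: near-uniform (dense) at high
    temperature, concentrated on a few experts as temperature decreases."""
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    # Hypothetical sparsification step: below a cutoff temperature, keep
    # only the top-k experts and renormalize their routing weights.
    if temperature < 0.5:
        keep = np.argsort(probs)[-top_k:]
        mask = np.zeros_like(probs)
        mask[keep] = probs[keep]
        probs = mask / mask.sum()
    return probs

logits = np.array([2.0, 1.0, 0.5, -1.0])
dense = dts_gate(logits, temperature=10.0, top_k=2)   # all experts receive tokens
sparse = dts_gate(logits, temperature=0.1, top_k=2)   # only two experts remain active
```

Training with the dense phase first lets every expert receive gradient signal before routing hardens, which is one way to avoid the accuracy loss the abstract attributes to jointly training experts and an already-sparse gate.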

CrossoverScheduler: Overlapping Multiple Distributed Training Applications in a Crossover Manner

no code implementations • 14 Mar 2021 • Cheng Luo, Lei Qu, Youshan Miao, Peng Cheng, Yongqiang Xiong

Distributed deep learning workloads include throughput-intensive training tasks on GPU clusters, where distributed Stochastic Gradient Descent (SGD) incurs significant communication delays after backward propagation, forcing workers to wait for gradient synchronization via a centralized parameter server or directly among decentralized workers.
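The synchronization barrier the abstract describes can be sketched as follows: after backward propagation, every worker blocks until all gradients are available, and the element-wise mean is applied to the shared parameters (whether a parameter server or a decentralized all-reduce computes it). Worker count and values here are hypothetical:

```python
import numpy as np

def synchronize_gradients(worker_grads):
    """Synchronous SGD barrier: no worker proceeds until every worker's
    gradient has arrived; the update uses their element-wise mean."""
    return np.mean(worker_grads, axis=0)

# Three hypothetical workers produce gradients for the same parameters.
grads = [np.array([1.0, 2.0]),
         np.array([3.0, 4.0]),
         np.array([5.0, 6.0])]
avg = synchronize_gradients(grads)   # mean gradient across workers
params = np.array([10.0, 10.0])
params -= 0.1 * avg                  # one synchronous SGD step
```

During the averaging step the GPUs sit idle, which is the communication delay that scheduling approaches like CrossoverScheduler aim to overlap with useful work from another application.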

Image Classification

PaGraph: Scaling GNN Training on Large Graphs via Computation-aware Caching and Partitioning

no code implementations • Proceedings of the 11th ACM Symposium on Cloud Computing 2020 • Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, Yinlong Xu

Emerging graph neural networks (GNNs) have extended the successes of deep learning techniques against datasets like images and texts to more complex graph-structured data.

Architectural Implications of Graph Neural Networks

no code implementations • 2 Sep 2020 • Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo

Graph neural networks (GNN) represent an emerging line of deep learning models that operate on graph structures.
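The core operation these models share is message passing over a graph structure. A minimal sketch of one layer, assuming mean aggregation over neighbours followed by a linear transform and ReLU (the graph, features, and weights below are made up for illustration):

```python
import numpy as np

def gnn_layer(adj, features, weight):
    """One message-passing layer: each vertex averages its neighbours'
    feature vectors, applies a linear transform, then a ReLU."""
    deg = adj.sum(axis=1, keepdims=True)
    aggregated = (adj @ features) / np.maximum(deg, 1.0)  # mean aggregation
    return np.maximum(aggregated @ weight, 0.0)           # ReLU

# Tiny 3-vertex path graph: edges 0-1 and 1-2.
adj = np.array([[0., 1., 0.],
                [1., 0., 1.],
                [0., 1., 0.]])
feats = np.array([[1., 0.],
                  [0., 1.],
                  [1., 1.]])
out = gnn_layer(adj, feats, np.eye(2))
```

The sparse, irregular access pattern of `adj @ features` is exactly what gives GNNs an architectural profile unlike dense CNN or transformer workloads, which is the subject of this paper.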

Towards Efficient Large-Scale Graph Neural Network Computing

no code implementations • 19 Oct 2018 • Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, Yafei Dai

This evolution has led to large graph-based irregular and sparse models that go beyond what existing deep learning frameworks are designed for.

graph partitioning • Knowledge Graphs

RPC Considered Harmful: Fast Distributed Deep Learning on RDMA

no code implementations • 22 May 2018 • Jilong Xue, Youshan Miao, Cheng Chen, Ming Wu, Lintao Zhang, Lidong Zhou

Its computation is typically characterized by a simple tensor data abstraction to model multi-dimensional matrices, a data-flow graph to model computation, and iterative executions with relatively frequent synchronizations, thereby making it substantially different from Map/Reduce style distributed big data computation.
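The characterization above (tensors as the data abstraction, a data-flow graph as the computation model) can be illustrated with a toy executor. This is a generic sketch of the data-flow idea, not the paper's system; node names and ops are invented:

```python
import numpy as np

# A data-flow graph: each node names its input edges and the op that
# consumes those tensors. Leaves take no inputs and produce constants.
graph = {
    "x":    ((), lambda: np.array([[1., -2.], [3., 4.]])),
    "w":    ((), lambda: np.eye(2)),
    "mul":  (("x", "w"), lambda x, w: x @ w),
    "relu": (("mul",), lambda m: np.maximum(m, 0.0)),
}

def evaluate(graph, node, cache=None):
    """Evaluate a node by recursively evaluating its inputs first,
    caching intermediate tensors so shared edges are computed once."""
    cache = {} if cache is None else cache
    if node not in cache:
        deps, op = graph[node]
        args = [evaluate(graph, d, cache) for d in deps]
        cache[node] = op(*args)
    return cache[node]

result = evaluate(graph, "relu")
```

In a real framework this whole graph is re-executed every training iteration, with a synchronization between iterations; that frequent, fine-grained communication is what makes RPC a poor fit compared to RDMA in the paper's argument.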
