no code implementations • 22 May 2018 • Jilong Xue, Youshan Miao, Cheng Chen, Ming Wu, Lintao Zhang, Lidong Zhou
Its computation is typically characterized by a simple tensor data abstraction to model multi-dimensional matrices, a data-flow graph to model computation, and iterative executions with relatively frequent synchronizations, thereby making it substantially different from Map/Reduce style distributed big data computation.
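The tensor-plus-data-flow-graph abstraction described above can be sketched minimally: nodes are tensor operations, edges carry multi-dimensional arrays, and execution walks the graph in dependency order. This is an illustrative toy, not the API of any real framework.

```python
import numpy as np

# Illustrative data-flow graph: each node maps to (op, input_nodes).
graph = {
    "a":   (lambda: np.ones((2, 2)), []),
    "b":   (lambda: np.full((2, 2), 3.0), []),
    "add": (lambda x, y: x + y, ["a", "b"]),
    "sum": (lambda x: float(x.sum()), ["add"]),
}

def run(node, cache):
    """Evaluate a node by recursively evaluating its inputs first,
    memoizing results so shared subgraphs execute once."""
    if node not in cache:
        fn, deps = graph[node]
        cache[node] = fn(*(run(d, cache) for d in deps))
    return cache[node]

result = run("sum", {})  # 2x2 elements, each 1 + 3, summed: 16.0
```

In real systems the iterative training loop re-executes this graph each step, which is where the frequent synchronizations the abstract mentions come from.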
no code implementations • 19 Oct 2018 • Lingxiao Ma, Zhi Yang, Youshan Miao, Jilong Xue, Ming Wu, Lidong Zhou, Yafei Dai
This evolution has led to large graph-based irregular and sparse models that go beyond what existing deep learning frameworks are designed for.
no code implementations • 2 Sep 2020 • Zhihui Zhang, Jingwen Leng, Lingxiao Ma, Youshan Miao, Chao Li, Minyi Guo
Graph neural networks (GNN) represent an emerging line of deep learning models that operate on graph structures.
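A minimal sketch of what "operating on graph structures" means in a GNN layer, assuming mean-neighbor aggregation followed by a shared linear transform and ReLU (one common formulation, not the specific model of this paper):

```python
import numpy as np

def gnn_layer(adj, feats, weight):
    """One message-passing step: each node averages its neighbors'
    features, then applies a shared linear transform and ReLU."""
    deg = adj.sum(axis=1, keepdims=True)   # neighbor counts per node
    deg[deg == 0] = 1                      # avoid division by zero
    agg = (adj @ feats) / deg              # mean aggregation over neighbors
    return np.maximum(agg @ weight, 0)     # linear transform + ReLU

# 3-node path graph: 0 - 1 - 2
adj = np.array([[0, 1, 0],
                [1, 0, 1],
                [0, 1, 0]], dtype=float)
feats = np.eye(3)                          # one-hot node features
weight = np.ones((3, 2))
out = gnn_layer(adj, feats, weight)        # shape (3, 2)
```

The sparse, irregular `adj @ feats` step is exactly the workload that sits awkwardly on dense-tensor deep learning frameworks.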
no code implementations • Proceedings of the 11th ACM Symposium on Cloud Computing 2020 • Zhiqi Lin, Cheng Li, Youshan Miao, Yunxin Liu, Yinlong Xu
Emerging graph neural networks (GNNs) have extended the successes of deep learning techniques on data such as images and text to more complex graph-structured data.
no code implementations • 14 Mar 2021 • Cheng Luo, Lei Qu, Youshan Miao, Peng Cheng, Yongqiang Xiong
Distributed deep learning workloads include throughput-intensive training tasks on GPU clusters, where distributed Stochastic Gradient Descent (SGD) incurs significant communication delays after backward propagation, forcing workers to wait for gradient synchronization via a centralized parameter server or directly among decentralized workers.
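The synchronization step above can be sketched as synchronous gradient averaging: after backward propagation, every worker must hold the mean of all workers' gradients before the next step, which is the barrier that stalls the pipeline. A minimal simulation (the function name is illustrative):

```python
import numpy as np

def allreduce_average(worker_grads):
    """Synchronous gradient averaging, as in a decentralized
    all-reduce: every worker ends up with the element-wise mean
    of all workers' gradients. No worker may proceed until the
    mean is available -- this wait is the communication delay."""
    mean = np.mean(worker_grads, axis=0)
    return [mean.copy() for _ in worker_grads]

grads = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
synced = allreduce_average(grads)  # both workers now hold [2.0, 3.0]
```

A parameter-server design performs the same averaging at a central node; the decentralized variant spreads it across workers, but the synchronization barrier remains.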
2 code implementations • 12 May 2021 • Abhinav Jangda, Jun Huang, Guodong Liu, Amir Hossein Nodehi Sabet, Saeed Maleki, Youshan Miao, Madanlal Musuvathi, Todd Mytkowicz, Olli Saarikivi
Therefore, we present CoCoNeT, which provides a DSL to express programs containing both computation and communication.
2 code implementations • 29 Dec 2021 • Xiaonan Nie, Xupeng Miao, Shijie Cao, Lingxiao Ma, Qibin Liu, Jilong Xue, Youshan Miao, Yi Liu, Zhi Yang, Bin Cui
Then it diversifies the experts and continues to train the MoE with a novel Dense-to-Sparse gate (DTS-Gate).
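One way a dense-to-sparse gate can work is via a temperature schedule: early in training a high-temperature softmax routes tokens to nearly all experts (dense), and as the temperature decays, routing concentrates until only the top-k experts are kept (sparse). This is a hypothetical sketch of the idea, not the paper's exact DTS-Gate; the threshold `0.1` and `k` are illustrative assumptions.

```python
import numpy as np

def dts_gate(logits, temperature, k=1):
    """Illustrative dense-to-sparse gate: high temperature yields
    near-uniform routing over all experts; below a small
    temperature threshold, keep only the top-k experts."""
    probs = np.exp(logits / temperature)
    probs /= probs.sum()
    if temperature < 0.1:                  # sparse phase: top-k routing
        mask = np.zeros_like(probs)
        top = np.argsort(probs)[-k:]
        mask[top] = probs[top]
        probs = mask / mask.sum()
    return probs

logits = np.array([0.5, 2.0, 1.0])
dense = dts_gate(logits, temperature=10.0)   # near-uniform over 3 experts
sparse = dts_gate(logits, temperature=0.05)  # routes only to expert 1
```

The dense phase lets all experts receive gradient signal and diversify; the sparse phase recovers the compute savings of standard MoE routing.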
no code implementations • 21 Jan 2023 • Zhiqi Lin, Youshan Miao, Guodong Liu, Xiaoxiang Shi, Quanlu Zhang, Fan Yang, Saeed Maleki, Yi Zhu, Xu Cao, Cheng Li, Mao Yang, Lintao Zhang, Lidong Zhou
SuperScaler is a system that facilitates the design and generation of highly flexible parallelization plans.
no code implementations • 31 May 2023 • Yijia Zhang, Yibo Han, Shijie Cao, Guohao Dai, Youshan Miao, Ting Cao, Fan Yang, Ningyi Xu
We find that prior gradient accumulation reduces activation memory but is incompatible with gradient-memory reduction: accumulation requires preserving the gradient buffer across micro-batches, while gradient-memory reduction requires releasing it.
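A minimal sketch of standard gradient accumulation, which shows where the conflict arises: only one micro-batch's activations are alive at a time (saving activation memory), but the accumulated gradient buffer `total` must persist across all micro-batches, so it cannot be released early. The toy `grad_fn` is an illustrative stand-in for a backward pass.

```python
import numpy as np

def accumulate_gradients(micro_batches, grad_fn):
    """Gradient accumulation: run backward on micro-batches one at a
    time and sum their gradients, matching a single large-batch step.
    The running buffer `total` must be preserved until the final
    micro-batch -- the source of the gradient-memory conflict."""
    total = None
    for mb in micro_batches:
        g = grad_fn(mb)                 # backward pass on one micro-batch
        total = g if total is None else total + g
    return total / len(micro_batches)

# Toy gradient: for loss = mean(batch) w.r.t. a scalar weight,
# pretend the gradient equals the batch mean.
grad_fn = lambda batch: float(np.mean(batch))
mbs = [np.array([1.0, 2.0]), np.array([3.0, 4.0])]
g = accumulate_gradients(mbs, grad_fn)  # equals the full-batch mean: 2.5
```
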
no code implementations • 26 Nov 2023 • Zhiqi Lin, Youshan Miao, Guanbin Xu, Cheng Li, Olli Saarikivi, Saeed Maleki, Fan Yang
This paper presents Tessel, an automated system that searches for efficient schedules for distributed DNN training and inference for diverse operator placement strategies.