Search Results for author: Yue Cheng

Found 15 papers, 8 papers with code

Beyond Efficiency: A Systematic Survey of Resource-Efficient Large Language Models

1 code implementation • 1 Jan 2024 • Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao

We categorize methods based on their optimization focus (computational, memory, energy, financial, and network resources) and their applicability across various stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design.
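For a quick mental model of the two taxonomy axes named above, the toy Python snippet below places a method on a (resource focus, lifecycle stage) grid; the helper function and the example method name are hypothetical, not taken from the survey.

```python
# Toy representation of the survey's two taxonomy axes, using only the
# categories named in the abstract; the survey itself is far more detailed.
RESOURCE_FOCI = ["computational", "memory", "energy", "financial", "network"]
LIFECYCLE_STAGES = ["architecture design", "pretraining", "finetuning", "system design"]

def classify(method_name: str, focus: str, stage: str) -> dict:
    """Place a (hypothetical) method into the (focus, stage) grid."""
    assert focus in RESOURCE_FOCI and stage in LIFECYCLE_STAGES
    return {"method": method_name, "focus": focus, "stage": stage}

# Example (hypothetical method name):
# classify("low-rank finetuning", "memory", "finetuning")
```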

Staleness-Alleviated Distributed GNN Training via Online Dynamic-Embedding Prediction

no code implementations • 25 Aug 2023 • Guangji Bai, Ziyang Yu, Zheng Chai, Yue Cheng, Liang Zhao

It utilizes an offline memory to cache historical information (e.g., node embeddings) as an affordable approximation of the exact value and achieves high concurrency.
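As a rough illustration of the "offline memory" idea, here is a minimal sketch of caching the last computed node embeddings and serving them as stale approximations; the class and its update rule are illustrative and not the paper's dynamic-embedding prediction scheme.

```python
import torch

class EmbeddingCache:
    """Illustrative offline memory: keep the last computed embedding per node
    and serve it as a cheap, possibly stale, approximation of the exact value."""

    def __init__(self, num_nodes: int, dim: int):
        self.store = torch.zeros(num_nodes, dim)
        self.last_step = torch.zeros(num_nodes, dtype=torch.long)  # staleness bookkeeping

    def update(self, node_ids: torch.Tensor, embeddings: torch.Tensor, step: int):
        # Refresh cached embeddings for the nodes computed exactly this step.
        self.store[node_ids] = embeddings.detach()
        self.last_step[node_ids] = step

    def lookup(self, node_ids: torch.Tensor) -> torch.Tensor:
        # Return historical embeddings without recomputation (high concurrency,
        # at the price of staleness).
        return self.store[node_ids]
```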

Distributed Computing

Distributed Graph Neural Network Training with Periodic Stale Representation Synchronization

no code implementations • 31 May 2022 • Zheng Chai, Guangji Bai, Liang Zhao, Yue Cheng

Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs the graph integrity and model performance.
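For contrast, the snippet below sketches the kind of random edge dropping that sampling-based training relies on; it is a generic illustration, not a specific sampler from the literature.

```python
import torch

def drop_edges(edge_index: torch.Tensor, keep_prob: float = 0.5) -> torch.Tensor:
    """Randomly keep a fraction of edges from a [2, num_edges] edge list.
    This cuts message-passing cost but discards part of the graph structure."""
    mask = torch.rand(edge_index.size(1)) < keep_prob
    return edge_index[:, mask]
```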

Graph Embedding • Knowledge Graphs +1

Towards cost-effective and resource-aware aggregation at Edge for Federated Learning

no code implementations • 16 Apr 2022 • Ahmad Faraz Khan, Yuze Li, Xinran Wang, Sabaat Haroon, Haider Ali, Yue Cheng, Ali R. Butt, Ali Anwar

Federated Learning (FL) is a machine learning approach that addresses privacy and data transfer costs by computing data at the source.

Federated Learning

Asynchronous Federated Learning for Sensor Data with Concept Drift

no code implementations • 1 Sep 2021 • Yujing Chen, Zheng Chai, Yue Cheng, Huzefa Rangwala

We propose a novel approach, FedConD, to detect and deal with the concept drift on local devices and minimize the effect on the performance of models in asynchronous FL.
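As a minimal sketch of one common way to flag drift on a local device (watching for a sustained jump in local loss), consider the heuristic below; FedConD's actual detection and correction strategy is not reproduced here.

```python
from collections import deque

class DriftDetector:
    """Generic drift heuristic: flag drift when the recent average local loss
    degrades noticeably relative to a longer history. Window sizes and the
    tolerance are illustrative, not FedConD's parameters."""

    def __init__(self, window: int = 20, recent: int = 5, tol: float = 0.2):
        self.history = deque(maxlen=window)
        self.recent = recent
        self.tol = tol

    def observe(self, loss: float) -> bool:
        self.history.append(loss)
        if len(self.history) < self.history.maxlen:
            return False  # not enough history yet
        past = list(self.history)[:-self.recent]
        now = list(self.history)[-self.recent:]
        return sum(now) / len(now) > (sum(past) / len(past)) * (1 + self.tol)
```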

Ensemble Learning • Federated Learning

Towards Quantized Model Parallelism for Graph-Augmented MLPs Based on Gradient-Free ADMM Framework

1 code implementation • 20 May 2021 • Junxiang Wang, Hongyi Li, Zheng Chai, Yongchao Wang, Yue Cheng, Liang Zhao

Theoretical convergence to a (quantized) stationary point of the pdADMM-G algorithm and the pdADMM-G-Q algorithm is provided with a sublinear convergence rate $o(1/k)$, where $k$ is the number of iterations.
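As a generic reminder of what a sublinear $o(1/k)$ rate to a stationary point means (not the paper's exact theorem statement), the best residual over the first $k$ iterations vanishes faster than $1/k$:

```latex
% c_t denotes some stationarity measure at iteration t (e.g., a squared
% gradient or residual norm); the exact quantity is defined in the paper.
\min_{1 \le t \le k} c_t \;=\; o\!\left(\tfrac{1}{k}\right)
\quad\Longleftrightarrow\quad
\lim_{k \to \infty} k \cdot \min_{1 \le t \le k} c_t = 0 .
```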

Quantization

pdADMM: parallel deep learning Alternating Direction Method of Multipliers

1 code implementation • 1 Nov 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao

In this paper, we propose a novel parallel deep learning ADMM framework (pdADMM) to achieve layer parallelism: parameters in each layer of neural networks can be updated independently in parallel.
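To illustrate what layer parallelism means in an ADMM-style splitting, here is a toy sketch in which each layer's weights solve an independent least-squares subproblem against fixed auxiliary variables; this is a schematic stand-in, not pdADMM's actual subproblem or update order.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def layer_subproblem(p_in: np.ndarray, q_out: np.ndarray) -> np.ndarray:
    """Toy per-layer update: with the layer's input p_in (in_dim x batch) and
    target output q_out (out_dim x batch) held fixed as auxiliary variables,
    the weights solve an independent problem min_W ||q_out - W @ p_in||_F^2."""
    W_T, *_ = np.linalg.lstsq(p_in.T, q_out.T, rcond=None)
    return W_T.T

def parallel_layer_updates(p_ins, q_outs):
    # Each layer's subproblem depends only on its own auxiliaries, so all
    # layers can be updated concurrently within one iteration.
    with ThreadPoolExecutor() as pool:
        return list(pool.map(layer_subproblem, p_ins, q_outs))
```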

Wukong: A Scalable and Locality-Enhanced Framework for Serverless Parallel Computing

4 code implementations • 14 Oct 2020 • Benjamin Carver, Jingyuan Zhang, Ao Wang, Ali Anwar, Panruo Wu, Yue Cheng

Serverless computing is increasingly being used for parallel computing workloads, which have traditionally been implemented as stateful applications.

Distributed, Parallel, and Cluster Computing

FedAT: A High-Performance and Communication-Efficient Federated Learning System with Asynchronous Tiers

no code implementations • 12 Oct 2020 • Zheng Chai, Yujing Chen, Ali Anwar, Liang Zhao, Yue Cheng, Huzefa Rangwala

By bridging the synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect with improved convergence speed and test accuracy.
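A minimal sketch of the sync-within-tier / async-across-tier idea is shown below; the averaging and the tier weighting are illustrative, not FedAT's exact aggregation rules.

```python
import numpy as np

def aggregate_within_tier(client_updates):
    """Synchronous step inside one tier: average the updates from that tier's
    clients (toy FedAvg-style mean over a list of parameter vectors)."""
    return np.mean(np.stack(client_updates), axis=0)

def merge_tier_into_global(global_params, tier_params, tier_weight=0.1):
    """Asynchronous step across tiers: whenever a tier finishes a round, blend
    its result into the global model with a tier-specific weight."""
    return (1.0 - tier_weight) * global_params + tier_weight * tier_params
```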

Federated Learning

Tunable Subnetwork Splitting for Model-parallelism of Neural Network Training

1 code implementation • 9 Sep 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao

In this paper, we analyze the reason and propose to achieve a compelling trade-off between parallelism and accuracy by a reformulation called Tunable Subnetwork Splitting Method (TSSM), which can tune the decomposition granularity of deep neural networks.
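The notion of tunable decomposition granularity can be pictured with the schematic helper below, which groups consecutive layers into subnetworks of a chosen size; it is not TSSM's reformulation itself.

```python
def split_into_subnetworks(layers, granularity: int):
    """Group consecutive layers into subnetworks of size `granularity`.
    Smaller groups expose more parallelism; larger groups stay closer to
    end-to-end training, which is the trade-off being tuned."""
    return [layers[i:i + granularity] for i in range(0, len(layers), granularity)]

# Example:
# split_into_subnetworks(["conv1", "conv2", "fc1", "fc2"], granularity=2)
# -> [["conv1", "conv2"], ["fc1", "fc2"]]
```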

TiFL: A Tier-based Federated Learning System

no code implementations • 25 Jan 2020 • Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng

To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resource and data quantity.
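A minimal sketch of profiling-based tiering and same-tier client selection is given below; the latency-based grouping and the uniform tier choice are simplifications of TiFL's adaptive policy.

```python
import random

def assign_tiers(latency_by_client: dict, num_tiers: int = 3):
    """Group clients into tiers by observed round latency, fastest tier first
    (toy version of profiling-based tiering)."""
    ranked = sorted(latency_by_client, key=latency_by_client.get)
    size = -(-len(ranked) // num_tiers)  # ceiling division
    return [ranked[i:i + size] for i in range(0, len(ranked), size)]

def select_round_clients(tiers, clients_per_round: int):
    """Pick one tier, then sample clients only from that tier, so a round is
    not held back by stragglers from slower tiers."""
    tier = random.choice(tiers)
    return random.sample(tier, min(clients_per_round, len(tier)))
```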

Federated Learning

In Search of a Fast and Efficient Serverless DAG Engine

2 code implementations • 14 Oct 2019 • Benjamin Carver, Jingyuan Zhang, Ao Wang, Yue Cheng

The auto-scaling property of serverless computing platforms accommodates short tasks and bursty workloads, while the pay-per-use billing model of serverless computing providers keeps the cost of short tasks low.

Distributed, Parallel, and Cluster Computing

Characterizing Co-located Datacenter Workloads: An Alibaba Case Study

1 code implementation • 8 Aug 2018 • Yue Cheng, Zheng Chai, Ali Anwar

Warehouse-scale cloud datacenters co-locate workloads with different and often complementary characteristics for improved resource utilization.

Distributed, Parallel, and Cluster Computing
