Search Results for author: Zhiquan Lai

Found 6 papers, 3 papers with code

Accurate and Efficient Fine-Tuning of Quantized Large Language Models Through Optimal Balance

1 code implementation • 24 Jul 2024 • Ao Shen, Qiang Wang, Zhiquan Lai, Xionglve Li, Dongsheng Li

We propose Quantized LLMs with Balanced-rank Adaptation (Q-BaRA), which simplifies the adapter inputs and outputs while increasing the adapter's rank to achieve a more suitable balance for fine-tuning quantized LLMs.

Quantization
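As a rough illustration of the balanced-rank idea described above, the sketch below pairs a narrower adapter input/output with a proportionally higher rank so the trainable-parameter budget stays roughly constant. The class, the pooling-based compression, and all parameter names are hypothetical and are not the paper's implementation.

```python
import torch
import torch.nn as nn

class BalancedRankAdapter(nn.Module):
    """Illustrative LoRA-style adapter: shrink the adapter's input/output width
    by `compress` and raise the rank by the same factor, keeping the number of
    trainable parameters roughly constant (all names are hypothetical)."""

    def __init__(self, hidden_size: int, base_rank: int = 16, compress: int = 2):
        super().__init__()
        reduced = hidden_size // compress      # simplified adapter input/output width
        rank = base_rank * compress            # higher rank for a similar budget
        # Average-pool groups of `compress` channels to reduce the adapter input.
        self.pool = nn.AvgPool1d(kernel_size=compress, stride=compress)
        self.down = nn.Linear(reduced, rank, bias=False)   # A: reduced -> rank
        self.up = nn.Linear(rank, reduced, bias=False)     # B: rank -> reduced
        nn.init.zeros_(self.up.weight)                     # adapter starts as a no-op
        self.compress = compress

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq, hidden) activations of the frozen, quantized layer
        pooled = self.pool(x)                              # (batch, seq, hidden // compress)
        delta = self.up(self.down(pooled))                 # low-rank update in reduced space
        # Broadcast the reduced update back to the full hidden size.
        return delta.repeat_interleave(self.compress, dim=-1)

if __name__ == "__main__":
    adapter = BalancedRankAdapter(hidden_size=64, base_rank=8, compress=2)
    print(adapter(torch.randn(2, 10, 64)).shape)  # torch.Size([2, 10, 64])
```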

Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models

1 code implementation • 10 Jun 2022 • Zhiquan Lai, Shengwei Li, Xudong Tang, Keshi Ge, Weijie Liu, Yabo Duan, Linbo Qiao, Dongsheng Li

These features make it necessary to apply 3D parallelism, which integrates data parallelism, pipeline model parallelism and tensor model parallelism, to achieve high training efficiency.
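For readers unfamiliar with how the three dimensions compose, the sketch below partitions a set of GPU ranks into tensor-, pipeline-, and data-parallel groups. The layout (tensor innermost, data outermost) is a common convention and is not necessarily the one Merak uses.

```python
def build_3d_groups(world_size: int, tensor_parallel: int, pipeline_parallel: int):
    """Partition `world_size` ranks into tensor-, pipeline-, and data-parallel
    groups (generic layout: tensor is the innermost dimension, data the outermost)."""
    assert world_size % (tensor_parallel * pipeline_parallel) == 0
    data_parallel = world_size // (tensor_parallel * pipeline_parallel)

    groups = {"tensor": [], "pipeline": [], "data": []}
    # Tensor-parallel groups: consecutive ranks share the shards of one layer.
    for start in range(0, world_size, tensor_parallel):
        groups["tensor"].append(list(range(start, start + tensor_parallel)))
    # Pipeline-parallel groups: ranks holding successive stages of one model replica.
    for d in range(data_parallel):
        for t in range(tensor_parallel):
            base = d * pipeline_parallel * tensor_parallel + t
            groups["pipeline"].append(
                [base + p * tensor_parallel for p in range(pipeline_parallel)])
    # Data-parallel groups: ranks holding the same shard of the same stage.
    for p in range(pipeline_parallel):
        for t in range(tensor_parallel):
            base = p * tensor_parallel + t
            groups["data"].append(
                [base + d * pipeline_parallel * tensor_parallel for d in range(data_parallel)])
    return groups

# Example: 8 GPUs with 2-way tensor, 2-way pipeline, and 2-way data parallelism.
print(build_3d_groups(world_size=8, tensor_parallel=2, pipeline_parallel=2))
```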

DELTA: Dynamically Optimizing GPU Memory beyond Tensor Recomputation

1 code implementation • 30 Mar 2022 • Yu Tang, Chenyu Wang, Yufan Zhang, Yuliang Liu, Xingcheng Zhang, Linbo Qiao, Zhiquan Lai, Dongsheng Li

To the best of our knowledge, we are the first to propose a dynamic runtime scheduler that combines tensor swapping and tensor recomputation without user oversight.
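The toy planner below illustrates only the underlying trade-off: for each activation that must be evicted, it compares an estimated swap (transfer) time against an estimated recomputation time. It is not DELTA's scheduler; the cost model and numbers are made up.

```python
from dataclasses import dataclass

@dataclass
class Activation:
    name: str
    size_mb: float          # memory footprint of the saved activation
    recompute_ms: float     # estimated time to recompute it in the backward pass

def plan_eviction(activations, memory_budget_mb, pcie_gb_per_s=12.0):
    """Toy planner (not DELTA's algorithm): evict activations until the budget
    is met, choosing swap vs. recompute by whichever is estimated to be cheaper."""
    plan, used = {}, sum(a.size_mb for a in activations)
    # Evict the largest activations first.
    for act in sorted(activations, key=lambda a: a.size_mb, reverse=True):
        if used <= memory_budget_mb:
            break
        swap_ms = act.size_mb / (pcie_gb_per_s * 1024) * 1000  # CPU<->GPU transfer estimate
        plan[act.name] = "swap" if swap_ms < act.recompute_ms else "recompute"
        used -= act.size_mb
    return plan

acts = [Activation("attn_out", 512, recompute_ms=30),
        Activation("mlp_out", 1024, recompute_ms=120),
        Activation("layernorm", 64, recompute_ms=2)]
print(plan_eviction(acts, memory_budget_mb=600))  # {'mlp_out': 'swap'}
```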

EmbRace: Accelerating Sparse Communication for Distributed Training of NLP Neural Networks

no code implementations • 18 Oct 2021 • Shengwei Li, Zhiquan Lai, Dongsheng Li, Yiming Zhang, Xiangyu Ye, Yabo Duan

EmbRace introduces Sparsity-aware Hybrid Communication, which integrates AlltoAll and model parallelism into data-parallel training to reduce the communication overhead of highly sparse parameters.

Image Classification +1
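The simulation below shows the routing idea in miniature, with no real communication: dense gradients are summed across replicas as usual, while sparse embedding rows are sent only to the worker that owns them, in the spirit of an AlltoAll exchange. The ownership rule and data layout are invented for illustration and differ from EmbRace's implementation.

```python
from collections import defaultdict

def route_sparse_gradients(per_worker_sparse_grads, num_workers):
    """Simulate an AlltoAll-style exchange: each embedding row index is owned by
    one worker (index % num_workers), and every worker sends its sparse gradient
    rows only to the owner instead of all-reducing a mostly-zero dense tensor."""
    inbox = [defaultdict(float) for _ in range(num_workers)]
    for grads in per_worker_sparse_grads:           # grads: {row_index: grad_value}
        for row, value in grads.items():
            owner = row % num_workers               # simple static ownership rule
            inbox[owner][row] += value              # the owner accumulates the updates
    return [dict(box) for box in inbox]

def allreduce_dense(per_worker_dense_grads):
    """Plain data-parallel all-reduce (sum) for the dense parameters."""
    return [sum(vals) for vals in zip(*per_worker_dense_grads)]

# Two workers: sparse embedding rows go via the "AlltoAll" path, dense grads via all-reduce.
sparse = [{0: 0.1, 3: 0.2}, {0: 0.4, 2: 0.5}]
dense = [[1.0, 2.0], [3.0, 4.0]]
print(route_sparse_gradients(sparse, num_workers=2))  # [{0: 0.5, 2: 0.5}, {3: 0.2}]
print(allreduce_dense(dense))                         # [4.0, 6.0]
```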

S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning

no code implementations • 5 Oct 2021 • Keshi Ge, Yongquan Fu, Zhiquan Lai, Xiaoge Deng, Dongsheng Li

The distributed stochastic gradient descent (SGD) approach is widely used in large-scale deep learning, and the gradient collective method is vital to ensuring the training scalability of distributed deep learning systems.

Deep Learning • Vocal Bursts Intensity Prediction
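As a simplified picture of sparse gradient collection, the sketch below exchanges only the nonzero (index, value) pairs of each worker's gradient and sums them index-wise instead of all-reducing dense vectors. S2 Reducer's actual reducer is sketch-based and more involved; this only conveys the motivation.

```python
import numpy as np

def compress(grad, eps=1e-8):
    """Keep only the nonzero entries of a sparse gradient as (index, value) pairs."""
    idx = np.nonzero(np.abs(grad) > eps)[0]
    return idx, grad[idx]

def sparse_allreduce(per_worker_grads):
    """Toy sparse collective: each worker contributes its compressed gradient,
    and the reducer sums values index-wise rather than reducing dense vectors.
    (This is not S2 Reducer's algorithm; it only shows why avoiding a dense
    all-reduce helps when gradients are highly sparse.)"""
    dim = per_worker_grads[0].shape[0]
    reduced = np.zeros(dim)
    for grad in per_worker_grads:
        idx, vals = compress(grad)
        np.add.at(reduced, idx, vals)        # index-wise accumulation
    return reduced

g1 = np.zeros(10); g1[[1, 7]] = [0.5, -0.2]
g2 = np.zeros(10); g2[[1, 3]] = [0.1, 0.9]
print(sparse_allreduce([g1, g2]))  # nonzeros only at indices 1, 3, 7
```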

Hierarchical Adaptive Pooling by Capturing High-order Dependency for Graph Representation Learning

no code implementations • 13 Apr 2021 • Ning Liu, Songlei Jian, Dongsheng Li, Yiming Zhang, Zhiquan Lai, Hongzuo Xu

Graph neural networks (GNNs) have proven mature enough for handling graph-structured data in node-level graph representation learning tasks.

Graph Classification • Graph Matching +2
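The snippet below shows one generic hierarchical pooling step (DiffPool-style coarsening of features and adjacency with a cluster-assignment matrix). It does not implement the paper's high-order dependency mechanism, and the assignment matrix is hand-written for the example.

```python
import numpy as np

def hierarchical_pool(adj, features, assignment):
    """One generic hierarchical pooling step: given a cluster-assignment matrix
    S (nodes x clusters), coarsen node features and adjacency as
    X' = S^T X and A' = S^T A S (the standard DiffPool-style coarsening rule)."""
    pooled_features = assignment.T @ features
    pooled_adj = assignment.T @ adj @ assignment
    return pooled_adj, pooled_features

# Tiny path graph: 4 nodes with 3 features each, pooled into 2 clusters.
adj = np.array([[0, 1, 0, 0],
                [1, 0, 1, 0],
                [0, 1, 0, 1],
                [0, 0, 1, 0]], dtype=float)
features = np.random.rand(4, 3)
assignment = np.array([[1, 0],    # nodes 0-1 -> cluster 0
                       [1, 0],
                       [0, 1],    # nodes 2-3 -> cluster 1
                       [0, 1]], dtype=float)
pooled_adj, pooled_features = hierarchical_pool(adj, features, assignment)
print(pooled_adj.shape, pooled_features.shape)  # (2, 2) (2, 3)
```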
