1 code implementation • 1 Jan 2024 • Guangji Bai, Zheng Chai, Chen Ling, Shiyu Wang, Jiaying Lu, Nan Zhang, Tingwei Shi, Ziyang Yu, Mengdan Zhu, Yifei Zhang, Carl Yang, Yue Cheng, Liang Zhao
We categorize methods by their optimization focus (computational, memory, energy, financial, and network resources) and by their applicability across the stages of an LLM's lifecycle, including architecture design, pretraining, finetuning, and system design.
no code implementations • 25 Aug 2023 • Guangji Bai, Ziyang Yu, Zheng Chai, Yue Cheng, Liang Zhao
It utilizes an offline memory to cache historical information (e.g., node embeddings) as an affordable approximation of the exact values, and it achieves high concurrency.
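A minimal sketch of this caching idea (class and method names are hypothetical, not the paper's API):

```python
import torch

class EmbeddingCache:
    """Offline memory that stores historical node embeddings.

    Stale cached values stand in for exact neighbor embeddings,
    so concurrent workers can read without recomputing them.
    """

    def __init__(self, num_nodes: int, dim: int):
        self.memory = torch.zeros(num_nodes, dim)

    def read(self, node_ids: torch.Tensor) -> torch.Tensor:
        # Approximate current embeddings with cached (possibly stale) ones.
        return self.memory[node_ids]

    def write(self, node_ids: torch.Tensor, emb: torch.Tensor) -> None:
        # Refresh the cache with freshly computed embeddings.
        self.memory[node_ids] = emb.detach()
```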
no code implementations • 31 May 2022 • Zheng Chai, Guangji Bai, Liang Zhao, Yue Cheng
Traditional sampling-based methods accelerate GNN training by dropping edges and nodes, which impairs graph integrity and degrades model performance.
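For illustration, a minimal sketch of the edge-dropping baseline being critiqued (the function name and `keep_prob` are assumptions, not from the paper):

```python
import torch

def drop_edges(edge_index: torch.Tensor, keep_prob: float = 0.8) -> torch.Tensor:
    """Randomly keep a fraction of edges from a 2 x E edge list.

    Dropped edges are simply discarded, which is what harms
    graph integrity in sampling-based training.
    """
    num_edges = edge_index.size(1)
    mask = torch.rand(num_edges) < keep_prob
    return edge_index[:, mask]
```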
no code implementations • 17 Sep 2021 • Meixiang Quan, Zheng Chai, Xiao Liu
Lines provide significantly richer geometric structural information about the environment than points, so they are widely used in recent Visual Odometry (VO) work.
no code implementations • 1 Sep 2021 • Yujing Chen, Zheng Chai, Yue Cheng, Huzefa Rangwala
We propose a novel approach, FedConD, to detect and handle concept drift on local devices and minimize its effect on model performance in asynchronous FL.
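The paper's exact detection rule is not reproduced here; the following is a hedged sketch of one plausible local drift monitor that compares recent and long-run training loss (all names and thresholds are hypothetical):

```python
from collections import deque

class DriftDetector:
    """Hypothetical local drift monitor (not FedConD's exact rule).

    Flags concept drift when the recent average training loss rises
    well above the long-run average observed on the device.
    """

    def __init__(self, window: int = 20, threshold: float = 1.5):
        self.recent = deque(maxlen=window)
        self.history = deque(maxlen=10 * window)
        self.threshold = threshold

    def update(self, loss: float) -> bool:
        self.recent.append(loss)
        self.history.append(loss)
        if len(self.history) < self.history.maxlen:
            return False  # not enough evidence yet
        recent_avg = sum(self.recent) / len(self.recent)
        long_avg = sum(self.history) / len(self.history)
        return recent_avg > self.threshold * long_avg
```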
no code implementations • 10 Aug 2021 • Xiaopeng Bi, Yu Chen, Xinyang Liu, Dehao Zhang, Ran Yan, Zheng Chai, Haotian Zhang, Xiao Liu
This report describes Megvii-3D team's approach towards CVPR 2021 Image Matching Workshop.
no code implementations • 10 Aug 2021 • Xiaopeng Bi, Ran Yan, Zheng Chai, Haotian Zhang, Xiao Liu
This report describes Megvii-3D team's approach towards SimLocMatch Challenge @ CVPR 2021 Image Matching Workshop.
1 code implementation • 20 May 2021 • Junxiang Wang, Hongyi Li, Zheng Chai, Yongchao Wang, Yue Cheng, Liang Zhao
Theoretical convergence of the pdADMM-G and pdADMM-G-Q algorithms to a (quantized) stationary point is established at a sublinear rate of $o(1/k)$, where $k$ is the number of iterations.
1 code implementation • 1 Nov 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao
In this paper, we propose a novel parallel deep learning ADMM framework (pdADMM) to achieve layer parallelism: parameters in each layer of a neural network can be updated independently and in parallel.
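A minimal sketch of the layer-parallel idea under simplifying assumptions (plain gradient steps stand in for the paper's ADMM subproblem solutions; all names are illustrative):

```python
from concurrent.futures import ThreadPoolExecutor
import numpy as np

def update_layer(W, a_in, z_target, rho=1.0, lr=0.1, steps=10):
    """Solve one layer's local subproblem: fit W so that W @ a_in
    matches this layer's auxiliary target z_target."""
    for _ in range(steps):
        grad = rho * (W @ a_in - z_target) @ a_in.T
        W = W - lr * grad
    return W

# Decoupled layers: each update depends only on its own auxiliary
# variables (a_in, z_target), so all layers can run concurrently.
layers = [np.random.randn(16, 16) for _ in range(4)]
a_ins = [np.random.randn(16, 32) for _ in range(4)]
z_tgts = [np.random.randn(16, 32) for _ in range(4)]

with ThreadPoolExecutor() as pool:
    layers = list(pool.map(update_layer, layers, a_ins, z_tgts))
```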
no code implementations • 12 Oct 2020 • Zheng Chai, Yujing Chen, Ali Anwar, Liang Zhao, Yue Cheng, Huzefa Rangwala
By bridging synchronous and asynchronous training through tiering, FedAT minimizes the straggler effect while improving convergence speed and test accuracy.
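A hedged sketch of the tiering idea (the aggregation weights and function names are assumptions, not FedAT's exact scheme):

```python
import numpy as np

def aggregate_tier(updates):
    """Synchronous FedAvg within one tier of similarly fast clients."""
    return np.mean(updates, axis=0)

def cross_tier_update(global_model, tier_model, tier_weight, lr=1.0):
    """Asynchronous, weighted merge of a tier's model into the global
    model as soon as that tier finishes; FedAT derives such weights
    from tier update frequency, here tier_weight is a placeholder."""
    return (1 - lr * tier_weight) * global_model + lr * tier_weight * tier_model
```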
1 code implementation • 9 Sep 2020 • Junxiang Wang, Zheng Chai, Yue Cheng, Liang Zhao
In this paper, we analyze the cause and propose a reformulation, the Tunable Subnetwork Splitting Method (TSSM), that achieves a compelling trade-off between parallelism and accuracy by tuning the decomposition granularity of deep neural networks.
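A minimal sketch of granularity-tunable splitting (the helper below is illustrative, not the paper's API):

```python
def split_into_subnetworks(layers, granularity):
    """Partition a layer list into subnetworks of `granularity` layers.

    granularity = 1 recovers full layer-wise decoupling (max parallelism);
    granularity = len(layers) recovers standard backprop (max accuracy).
    """
    return [layers[i:i + granularity] for i in range(0, len(layers), granularity)]
```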
no code implementations • 25 Jan 2020 • Zheng Chai, Ahsan Ali, Syed Zawad, Stacey Truex, Ali Anwar, Nathalie Baracaldo, Yi Zhou, Heiko Ludwig, Feng Yan, Yue Cheng
To this end, we propose TiFL, a Tier-based Federated Learning System, which divides clients into tiers based on their training performance and selects clients from the same tier in each training round to mitigate the straggler problem caused by heterogeneity in resources and data quantity.
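A minimal sketch of tier assignment and per-round selection, assuming profiled round latency as the performance signal (names are hypothetical):

```python
import random
from collections import defaultdict

def assign_tiers(latencies, num_tiers=3):
    """Group clients into tiers by profiled round latency (fastest first)."""
    ranked = sorted(latencies, key=latencies.get)
    tiers = defaultdict(list)
    for i, client in enumerate(ranked):
        tiers[i * num_tiers // len(ranked)].append(client)
    return tiers

def select_clients(tiers, clients_per_round):
    """Sample all participants for a round from a single tier so that
    no fast client waits on a slow one."""
    tier = random.choice(list(tiers))
    pool = tiers[tier]
    return random.sample(pool, min(clients_per_round, len(pool)))
```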
no code implementations • 13 May 2019 • Yujing Chen, Yue Ning, Zheng Chai, Huzefa Rangwala
The attention mechanism of the proposed model extracts feature representations from the input and learns a shared representation focused on the time dimension across multiple sensors.
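A minimal sketch of attention over the time dimension, assuming input of shape (batch, time, features); the module is illustrative, not the paper's architecture:

```python
import torch
import torch.nn as nn

class TemporalAttention(nn.Module):
    """Pools a multi-sensor sequence into one shared representation
    by attending over time steps."""

    def __init__(self, feat_dim: int):
        super().__init__()
        self.score = nn.Linear(feat_dim, 1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, time, feat). Score each time step, softmax over
        # time, then pool into one representation per sequence.
        weights = torch.softmax(self.score(x), dim=1)   # (batch, time, 1)
        return (weights * x).sum(dim=1)                 # (batch, feat)
```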
1 code implementation • 8 Aug 2018 • Yue Cheng, Zheng Chai, Ali Anwar
Warehouse-scale cloud datacenters co-locate workloads with different and often complementary characteristics for improved resource utilization.
Distributed, Parallel, and Cluster Computing