1 code implementation • 17 Mar 2025 • Kairong Luo, Haodong Wen, Shengding Hu, Zhenbo Sun, Zhiyuan Liu, Maosong Sun, Kaifeng Lyu, WenGuang Chen
Training large models is both resource-intensive and time-consuming, making it crucial to understand the quantitative relationship between model performance and hyperparameters.
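The paper's actual functional form is not reproduced in this snippet; as a rough illustration of the kind of quantitative relationship involved, the sketch below fits a generic power-law scaling curve to synthetic (hyperparameter, loss) points. The `power_law` ansatz and all data here are assumptions for illustration, not the paper's model.

```python
# Minimal sketch (not the paper's model): fitting a power-law relation
# between a hyperparameter (e.g., training budget) and measured loss.
import numpy as np
from scipy.optimize import curve_fit

def power_law(x, a, b, c):
    # loss ~ a * x^(-b) + c, a common scaling-law ansatz
    return a * np.power(x, -b) + c

# Synthetic observations standing in for (hyperparameter value, measured loss)
x = np.array([1e6, 3e6, 1e7, 3e7, 1e8])
y = power_law(x, 50.0, 0.3, 1.8) + np.random.normal(0, 0.01, size=x.size)

params, _ = curve_fit(power_law, x, y, p0=(10.0, 0.5, 1.0), maxfev=10000)
a, b, c = params
print(f"fitted loss curve: {a:.2f} * x^(-{b:.3f}) + {c:.2f}")
# Extrapolate to a larger budget to predict performance before training
print("predicted loss at 1e9:", power_law(1e9, *params))
```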
no code implementations • 1 Jan 2025 • Zhenyu Guo, WenGuang Chen
Transformers have achieved remarkable success across diverse domains, but their monolithic architecture presents challenges in interpretability, adaptability, and scalability.
no code implementations • 9 Dec 2024 • Boyang Zhang, Daning Cheng, Yunquan Zhang, Fangmin Liu, WenGuang Chen
A key challenge is effectively leveraging compression errors and defining the boundaries for lossless compression to minimize model loss.
1 code implementation • 10 Sep 2024 • Lei Liang, Mengshu Sun, Zhengke Gui, Zhongshu Zhu, Zhouyu Jiang, Ling Zhong, Yuan Qu, Peilong Zhao, Zhongpu Bo, Jin Yang, Huaidong Xiong, Lin Yuan, Jun Xu, Zaoyang Wang, Zhiqiang Zhang, Wen Zhang, Huajun Chen, WenGuang Chen, Jun Zhou
The recently developed retrieval-augmented generation (RAG) technology has enabled the efficient construction of domain-specific applications.
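As a rough sketch of the generic RAG pattern this work builds on (not the paper's KAG pipeline), the example below retrieves the most relevant documents for a query with TF-IDF and assembles a grounded prompt; the corpus, query, and prompt wording are invented for illustration.

```python
# Minimal sketch of the generic RAG pattern: retrieve domain documents,
# then condition generation on them. Not the paper's KAG pipeline.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

corpus = [
    "Policy A covers water damage caused by burst pipes.",
    "Policy B excludes damage from unreported renovations.",
    "Claims must be filed within 30 days of the incident.",
]

def retrieve(query: str, k: int = 2) -> list[str]:
    vec = TfidfVectorizer().fit(corpus + [query])
    doc_mat, q_vec = vec.transform(corpus), vec.transform([query])
    scores = cosine_similarity(q_vec, doc_mat).ravel()
    return [corpus[i] for i in scores.argsort()[::-1][:k]]

query = "Is a burst pipe covered?"
context = "\n".join(retrieve(query))
# The assembled prompt would then be passed to a generator LLM of choice.
prompt = f"Answer using only the context below.\n\nContext:\n{context}\n\nQuestion: {query}"
print(prompt)
```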
no code implementations • 10 Nov 2022 • Haiyang Lin, Mingyu Yan, Xiaochun Ye, Dongrui Fan, Shirui Pan, WenGuang Chen, Yuan Xie
This situation poses a considerable challenge for newcomers, hindering their ability to grasp a comprehensive understanding of the workflows, computational patterns, communication strategies, and optimization techniques employed in distributed GNN training.
9 code implementations • 5 Oct 2022 • Aohan Zeng, Xiao Liu, Zhengxiao Du, Zihan Wang, Hanyu Lai, Ming Ding, Zhuoyi Yang, Yifan Xu, Wendi Zheng, Xiao Xia, Weng Lam Tam, Zixuan Ma, Yufei Xue, Jidong Zhai, WenGuang Chen, Peng Zhang, Yuxiao Dong, Jie Tang
We introduce GLM-130B, a bilingual (English and Chinese) pre-trained language model with 130 billion parameters.
Ranked #1 on Language Modelling on CLUE (OCNLI_50K)
no code implementations • 20 Jul 2022 • Daning Cheng, WenGuang Chen
Because models are resilient to computational noise, model quantization is an important technique for compressing models and improving computing speed.
no code implementations • 10 Feb 2022 • Daning Cheng, WenGuang Chen
In this paper, we show that quantization of a layer's inputs has a greater effect on the loss function than quantization of its parameters.
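A toy comparison of the two choices is sketched below. It is not the paper's analysis, and the layer sizes, bit width, and random data are arbitrary assumptions, but it shows how one would measure the loss impact of quantizing inputs versus parameters in isolation.

```python
# Toy sketch (not the paper's analysis): compare how uniform quantization of
# a layer's inputs vs. its weights perturbs the loss of a small linear model.
import torch
import torch.nn.functional as F

torch.manual_seed(0)

def quantize(t: torch.Tensor, bits: int = 4) -> torch.Tensor:
    # Symmetric uniform quantization to 2^bits levels
    scale = t.abs().max() / (2 ** (bits - 1) - 1)
    return torch.round(t / scale) * scale

x = torch.randn(256, 32)            # layer inputs
w = torch.randn(10, 32) * 0.1       # layer weights
y = torch.randint(0, 10, (256,))    # labels

def loss_with(x_used, w_used):
    return F.cross_entropy(x_used @ w_used.t(), y).item()

print("baseline loss:         ", loss_with(x, w))
print("quantized inputs only: ", loss_with(quantize(x), w))
print("quantized weights only:", loss_with(x, quantize(w)))
```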
1 code implementation • 21 Apr 2021 • Yongchao Liu, Houyi Li, Guowei Zhang, Xintan Zeng, Yongyong Li, Bin Huang, Peng Zhang, Zhao Li, Xiaowei Zhu, Changhua He, WenGuang Chen
Herein, we present GraphTheta, the first distributed and scalable graph learning system built upon vertex-centric distributed graph processing with neural network operators implemented as user-defined functions.
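GraphTheta's actual programming interface is not shown in this snippet; the sketch below only illustrates the general idea of expressing a neural-network layer as vertex-centric user-defined functions (a per-edge scatter and a per-vertex apply). All class and function names here are hypothetical.

```python
# Illustrative sketch only, not GraphTheta's API: a GNN aggregation step
# written as vertex-centric user-defined functions.
import numpy as np

class Vertex:
    def __init__(self, feat):
        self.feat = feat          # node feature vector
        self.inbox = []           # messages received this superstep

def scatter(src: Vertex, dst: Vertex):
    # UDF run per edge: send the source feature to the destination
    dst.inbox.append(src.feat)

def apply(v: Vertex, weight: np.ndarray):
    # UDF run per vertex: mean-aggregate neighbors, then a linear + ReLU op
    agg = np.mean(v.inbox, axis=0) if v.inbox else np.zeros_like(v.feat)
    v.feat = np.maximum(agg @ weight, 0.0)
    v.inbox = []

# Tiny graph: 0 -> 1, 0 -> 2, 1 -> 2
verts = [Vertex(np.random.randn(4)) for _ in range(3)]
edges = [(0, 1), (0, 2), (1, 2)]
W = np.random.randn(4, 4) * 0.1

for s, d in edges:                 # one superstep of message passing
    scatter(verts[s], verts[d])
for v in verts:
    apply(v, W)
print([v.feat.round(3) for v in verts])
```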
1 code implementation • 17 Aug 2020 • Zhixiang Ren, Yongheng Liu, Tianhui Shi, Lei Xie, Yue Zhou, Jidong Zhai, Youhui Zhang, Yunquan Zhang, WenGuang Chen
The de facto HPC benchmark, LINPACK, cannot reflect AI computing power or I/O performance because it lacks representative AI workloads.
no code implementations • 15 Nov 2017 • Yu Ji, Youhui Zhang, WenGuang Chen, Yuan Xie
Unlike developing neural networks (NNs) for general-purpose processors, development for NN chips usually faces hardware-specific restrictions, such as limited precision of network signals and parameters, constrained computation scale, and a limited set of non-linear functions.
no code implementations • 8 Oct 2016 • Kaiwei Li, Jianfei Chen, WenGuang Chen, Jun Zhu
Latent Dirichlet Allocation (LDA) is a popular tool for analyzing discrete count data such as text and images.
no code implementations • 29 Oct 2015 • Jianfei Chen, Kaiwei Li, Jun Zhu, WenGuang Chen
We then develop WarpLDA, an LDA sampler that achieves both the optimal O(1) time complexity per token and the optimal O(K) scope of random access.
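WarpLDA's Metropolis-Hastings scheme is not reproduced in this snippet; for reference, the sketch below is the standard collapsed Gibbs sampler whose O(K)-per-token conditional WarpLDA's O(1) proposals replace. The tiny corpus and hyperparameters are made up.

```python
# Reference sketch: a standard collapsed Gibbs sampler for LDA, whose O(K)
# per-token cost is what WarpLDA improves on. This is not WarpLDA itself.
import numpy as np

docs = [[0, 1, 2, 1], [2, 3, 3, 0], [1, 1, 4, 2]]   # word ids per document
V, K, alpha, beta = 5, 3, 0.1, 0.01
rng = np.random.default_rng(0)

n_dk = np.zeros((len(docs), K))      # doc-topic counts
n_kw = np.zeros((K, V))              # topic-word counts
n_k = np.zeros(K)                    # topic totals
z = [[rng.integers(K) for _ in d] for d in docs]
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        k = z[d][i]
        n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

for _ in range(50):                  # Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            k = z[d][i]
            n_dk[d, k] -= 1; n_kw[k, w] -= 1; n_k[k] -= 1
            # O(K) work per token: full conditional over all topics
            p = (n_dk[d] + alpha) * (n_kw[:, w] + beta) / (n_k + V * beta)
            k = rng.choice(K, p=p / p.sum())
            z[d][i] = k
            n_dk[d, k] += 1; n_kw[k, w] += 1; n_k[k] += 1

print("topic-word counts:\n", n_kw)
```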