1 code implementation • 16 Sep 2024 • Luning Wang, Shiyao Li, Xuefei Ning, Zhihang Yuan, Shengen Yan, Guohao Dai, Yu Wang
Therefore, we introduce CSKV, a training-efficient Channel Shrinking technique for KV cache compression: (1) We first analyze the singular value distribution of the KV cache, revealing significant redundancy and compression potential along the channel dimension.
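The singular value analysis mentioned in this excerpt can be illustrated with a minimal sketch. It is not the CSKV authors' code: the `channels_for_energy` helper, the 95% energy threshold, and the synthetic low-rank matrix standing in for a real key or value cache are all assumptions made for illustration.

```python
import numpy as np


def channels_for_energy(cache: np.ndarray, energy: float = 0.95) -> int:
    """Return the smallest number of singular directions whose squared
    singular values account for at least `energy` of the total.

    Hypothetical helper for illustration; `cache` has shape (tokens, channels).
    """
    s = np.linalg.svd(cache, compute_uv=False)  # singular values, descending
    cumulative = np.cumsum(s**2) / np.sum(s**2)
    # First index where the cumulative share reaches the threshold.
    k = int(np.searchsorted(cumulative, energy)) + 1
    return min(k, s.size)


# Synthetic stand-in for a KV cache (assumption): low-rank signal plus small noise.
rng = np.random.default_rng(0)
tokens, channels, true_rank = 512, 128, 16
cache = rng.standard_normal((tokens, true_rank)) @ rng.standard_normal((true_rank, channels))
cache += 0.01 * rng.standard_normal((tokens, channels))

k = channels_for_energy(cache)
print(f"{k} of {channels} channel directions retain 95% of the spectral energy")
```

If a real cache shows a similarly fast-decaying spectrum, a small number of directions captures most of its energy, which is the kind of channel-dimension redundancy the excerpt describes.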
no code implementations • 3 Aug 2024 • Liang-bo Ning, Zeyu Dai, Wenqi Fan, Jingran Su, Chao Pan, Luning Wang, Qing Li
Recent studies have shown that adversaries can manipulate the predictions of DNNs by adding a universal adversarial perturbation (UAP) to benign samples.
no code implementations • 22 Apr 2024 • Zixuan Zhou, Xuefei Ning, Ke Hong, Tianyu Fu, Jiaming Xu, Shiyao Li, Yuming Lou, Luning Wang, Zhihang Yuan, Xiuhong Li, Shengen Yan, Guohao Dai, Xiao-Ping Zhang, Yuhan Dong, Yu Wang
This paper presents a comprehensive survey of the existing literature on efficient LLM inference.
1 code implementation • 28 Feb 2024 • Shiyao Li, Xuefei Ning, Luning Wang, Tengxuan Liu, Xiangsheng Shi, Shengen Yan, Guohao Dai, Huazhong Yang, Yu Wang
Specifically, PTQ can effectively mitigate memory consumption and reduce computational overhead in LLMs.
no code implementations • 13 Jul 2023 • Nevin L. Zhang, Kaican Li, Han Gao, Weiyan Xie, Zhi Lin, Zhenguo Li, Luning Wang, Yongxiang Huang
Domain generalization (DG) is about learning models that generalize well to new domains that are related to, but different from, the training domain(s).
no code implementations • 20 May 2023 • Jindi Zhang, Luning Wang, Dan Su, Yongxiang Huang, Caleb Chen Cao, Lei Chen
Machine learning systems can produce results that are biased toward certain demographic groups, a phenomenon known as the fairness problem.
1 code implementation • 13 May 2023 • Han Gao, Kaican Li, Weiyan Xie, Zhi Lin, Yongxiang Huang, Luning Wang, Caleb Chen Cao, Nevin L. Zhang
In this paper, we consider a third, lesser-known setting where a training domain is endowed with a collection of pairs of examples that share the same semantic information.