1 code implementation • 7 Apr 2024 • ZiHao Wang, Bin Cui, Shaoduo Gan
In this work, we found that by identifying the importance of attention layers, we could optimize the KV-cache jointly along two dimensions, i.e., sequence-wise and layer-wise.
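As a hedged illustration of that idea (a sketch, not the paper's exact method), the snippet below splits a global KV-cache token budget across attention layers according to an assumed importance score, then keeps the most-attended tokens within each layer; the function names and the scoring heuristic are hypothetical.

```python
import numpy as np

def allocate_kv_budget(layer_importance, total_budget):
    """Layer-wise step: split a global KV-cache token budget across
    attention layers in proportion to an (assumed) importance score,
    e.g. one derived from profiling each layer's attention patterns."""
    weights = np.asarray(layer_importance, dtype=float)
    weights = weights / weights.sum()
    return np.maximum(1, (weights * total_budget).astype(int))

def prune_sequence(attn_scores, budget):
    """Sequence-wise step: within one layer, keep only the `budget`
    cached tokens that received the most cumulative attention."""
    keep = np.argsort(attn_scores)[-budget:]
    return np.sort(keep)  # indices of KV entries to retain

# Toy usage: 4 layers sharing a global budget of 64 cached tokens.
budgets = allocate_kv_budget([0.9, 0.4, 0.7, 0.2], total_budget=64)
kept = prune_sequence(np.random.rand(128), budgets[0])
```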
no code implementations • COLING 2022 • Bin Ji, Shasha Li, Shaoduo Gan, Jie Yu, Jun Ma, Huijun Liu
Few-shot named entity recognition (NER) enables us to build a NER system for a new domain using very few labeled examples.
1 code implementation • 12 Jun 2022 • Lijie Xu, Shuang Qiu, Binhang Yuan, Jiawei Jiang, Cedric Renggli, Shaoduo Gan, Kaan Kara, Guoliang Li, Ji Liu, Wentao Wu, Jieping Ye, Ce Zhang
In this paper, we first conduct a systematic empirical study on existing data shuffling strategies, which reveals that all existing strategies have room for improvement -- they all suffer in terms of I/O performance or convergence rate.
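To make that I/O-versus-convergence trade-off concrete, here is a minimal sketch of one generic two-level strategy (not necessarily the one the paper proposes; block and buffer sizes are illustrative): data is read sequentially within blocks for I/O efficiency, while block order and a multi-block buffer are randomized to recover most of the convergence benefit of a full shuffle.

```python
import random

def block_shuffle(dataset, block_size=1024, buffer_blocks=8, seed=0):
    """Two-level shuffle: sequential reads within a block keep I/O
    cheap; shuffling the block order and a merged multi-block buffer
    restores most of the randomness a full shuffle would provide."""
    rng = random.Random(seed)
    blocks = [dataset[i:i + block_size]
              for i in range(0, len(dataset), block_size)]
    rng.shuffle(blocks)  # level 1: random block order
    buffer = []
    for block in blocks:
        buffer.extend(block)
        if len(buffer) >= buffer_blocks * block_size:
            rng.shuffle(buffer)  # level 2: shuffle within the buffer
            yield from buffer
            buffer = []
    rng.shuffle(buffer)
    yield from buffer

# Toy usage over an integer "dataset".
stream = list(block_shuffle(list(range(10000))))
```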
1 code implementation • 3 Jul 2021 • Shaoduo Gan, Xiangru Lian, Rui Wang, Jianbin Chang, Chengjun Liu, Hongmei Shi, Shengzhuo Zhang, Xianghong Li, Tengxu Sun, Jiawei Jiang, Binhang Yuan, Sen Yang, Ji Liu, Ce Zhang
Recent years have witnessed a growing list of systems for distributed data-parallel training.
1 code implementation • 17 May 2021 • Jiawei Jiang, Shaoduo Gan, Yue Liu, Fanlin Wang, Gustavo Alonso, Ana Klimovic, Ankit Singla, Wentao Wu, Ce Zhang
The appeal of serverless computing (FaaS) has triggered growing interest in how to use it for data-intensive applications such as ETL, query processing, or machine learning (ML).
2 code implementations • 4 Feb 2021 • Hanlin Tang, Shaoduo Gan, Ammar Ahmad Awan, Samyam Rajbhandari, Conglong Li, Xiangru Lian, Ji Liu, Ce Zhang, Yuxiong He
One of the most effective methods is error-compensated compression, which offers robust convergence speed even under 1-bit compression.
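The mechanism behind error compensation is error feedback: transmit only the sign of the residual-corrected gradient, scaled to preserve magnitude, and carry whatever the compression discarded into the next round. A minimal sketch, assuming NumPy tensors and a mean-absolute-value scaling heuristic:

```python
import numpy as np

class OneBitCompressor:
    """Error-compensated 1-bit compression: send scale * sign(g + e)
    and carry the compression residual e forward to the next round."""
    def __init__(self, shape):
        self.error = np.zeros(shape)  # residual from previous rounds

    def compress(self, grad):
        corrected = grad + self.error
        scale = np.abs(corrected).mean()      # one float per tensor
        compressed = scale * np.sign(corrected)
        self.error = corrected - compressed   # remember what was lost
        return compressed

# Toy usage: the residual is re-injected on every call.
comp = OneBitCompressor(shape=(4,))
for g in (np.array([0.5, -1.0, 0.2, 0.0]), np.array([0.1, 0.3, -0.2, 0.4])):
    print(comp.compress(g))
```

Because the residual is re-injected at every step, the long-run sum of transmitted updates tracks the sum of the true gradients, which is what preserves convergence under such aggressive compression.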
no code implementations • 1 Jan 2021 • Akhil Mathur, Shaoduo Gan, Anton Isopoussu, Fahim Kawsar, Nadia Berthouze, Nicholas Donald Lane
Breakthroughs in unsupervised domain adaptation (uDA) have opened up the possibility of adapting models from a label-rich source domain to unlabeled target domains.
no code implementations • 26 Aug 2020 • Hanlin Tang, Shaoduo Gan, Samyam Rajbhandari, Xiangru Lian, Ji Liu, Yuxiong He, Ce Zhang
Adam is an important optimization algorithm for efficiently and accurately training models on tasks such as BERT pre-training and ImageNet classification.
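For context, here is the standard uncompressed Adam update that 1-bit variants build on, in a minimal form with the usual default hyperparameters (nothing here is specific to this paper):

```python
import numpy as np

def adam_step(param, grad, m, v, t, lr=1e-3, b1=0.9, b2=0.999, eps=1e-8):
    """One standard Adam update: exponential moving averages of the
    gradient (m) and squared gradient (v), with bias correction."""
    m = b1 * m + (1 - b1) * grad
    v = b2 * v + (1 - b2) * grad ** 2
    m_hat = m / (1 - b1 ** t)   # bias-corrected first moment
    v_hat = v / (1 - b2 ** t)   # bias-corrected second moment
    param = param - lr * m_hat / (np.sqrt(v_hat) + eps)
    return param, m, v

# Toy usage: a single step at t=1.
p, m, v = adam_step(np.zeros(3), np.array([0.1, -0.2, 0.3]),
                    np.zeros(3), np.zeros(3), t=1)
```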
no code implementations • 25 Sep 2019 • Akhil Mathur, Shaoduo Gan, Anton Isopoussu, Fahim Kawsar, Nadia Berthouze, Nicholas D. Lane
Despite the recent breakthroughs in unsupervised domain adaptation (uDA), no prior work has studied the challenges of applying these methods in practical machine learning scenarios.
no code implementations • NeurIPS 2018 • Hanlin Tang, Shaoduo Gan, Ce Zhang, Tong Zhang, Ji Liu
In this paper, we explore a natural question: can the combination of both techniques lead to a system that is robust to both bandwidth and latency?
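A hedged sketch of what such a combination might look like (illustrative only; the topology, sign compression, and step size are assumptions, not the paper's algorithm): workers exchange 1-bit compressed gradients with graph neighbors instead of a central server, so each round is cheap in both bytes sent and synchronization hops.

```python
import numpy as np

def decentralized_compressed_step(params, grads, neighbors, lr=0.1):
    """One illustrative round: each worker receives 1-bit sign-compressed
    gradients from its graph neighbors only (decentralization addresses
    latency; sign compression addresses bandwidth), averages them, and
    applies a local update."""
    updated = []
    for i, p in enumerate(params):
        peers = neighbors[i] + [i]
        # 1-bit payload per entry: a shared scale times the sign vector.
        recv = [np.abs(grads[j]).mean() * np.sign(grads[j]) for j in peers]
        updated.append(p - lr * np.mean(recv, axis=0))
    return updated

# Toy usage: 3 workers, each connected to the other two.
params = [np.random.randn(4) for _ in range(3)]
grads = [np.random.randn(4) for _ in range(3)]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}
params = decentralized_compressed_step(params, grads, neighbors)
```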