1 code implementation • 12 Nov 2023 • Chenhe Dong, Yuexiang Xie, Bolin Ding, Ying Shen, Yaliang Li
As the global model itself is not required to be shared and the local training is conducted based on an auxiliary model with fewer parameters than the global model, the proposed approach provides protection for the global model while reducing communication and computation costs in FL.
no code implementations • 18 May 2023 • Chenhe Dong, Yuexiang Xie, Yaliang Li, Ying Shen
Despite substantial progress in generating fluent and informative texts with abstractive summarization, factual inconsistency in the generated summaries remains an important and challenging open problem.
1 code implementation • 12 Dec 2022 • Chenhe Dong, Yuexiang Xie, Bolin Ding, Ying Shen, Yaliang Li
In this study, we further broaden the application scope of FL in NLP by proposing an Assign-Then-Contrast (denoted as ATC) framework, which enables clients with heterogeneous NLP tasks to construct an FL course and learn useful knowledge from each other.
1 code implementation • 7 Jun 2022 • Liuyi Yao, Dawei Gao, Zhen Wang, Yuexiang Xie, Weirui Kuang, Daoyuan Chen, Haohui Wang, Chenhe Dong, Bolin Ding, Yaliang Li
To investigate heterogeneity in federated learning in real-world scenarios, we generalize classic federated learning to federated hetero-task learning, which emphasizes inconsistency across participants in terms of both data distribution and learning tasks.
no code implementations • 22 Dec 2021 • Chenhe Dong, Yinghui Li, Haifan Gong, Miaoxin Chen, Junxin Li, Ying Shen, Min Yang
This paper offers a comprehensive review of the research on Natural Language Generation (NLG) over the past two decades, especially in relation to data-to-text generation and text-to-text generation deep learning methods, as well as new applications of NLG technology.
1 code implementation • EMNLP 2021 • Chenhe Dong, Yaliang Li, Ying Shen, Minghui Qiu
In this paper, we aim to compress PLMs with knowledge distillation, and propose a hierarchical relational knowledge distillation (HRKD) method to capture both hierarchical and domain relational information.
1 code implementation • Findings (EMNLP) 2021 • Chenhe Dong, Guangrun Wang, Hang Xu, Jiefeng Peng, Xiaozhe Ren, Xiaodan Liang
In this paper, we offer the critical insight that improving the feed-forward network (FFN) in BERT yields a higher gain than improving the multi-head attention (MHA), since the computational cost of the FFN is 2 to 3 times larger than that of the MHA.
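A rough sketch of where a ratio of this order can come from, assuming BERT-base dimensions (hidden size 768, FFN inner size 3072), which the excerpt itself does not state. It counts only multiply-accumulates in the weight matrices and the attention products, ignoring biases, softmax, and layer normalization. It does not attempt to reproduce the authors' 2 to 3 times figure, which depends on how they measure cost. The function names are illustrative.

```python
def mha_macs(seq_len: int, hidden: int) -> int:
    """Approximate multiply-accumulates for one multi-head attention block."""
    # Q, K, V and output projections: four hidden x hidden matmuls per token.
    projections = 4 * seq_len * hidden * hidden
    # QK^T scores plus the weighted sum over values, summed over all heads.
    attention = 2 * seq_len * seq_len * hidden
    return projections + attention


def ffn_macs(seq_len: int, hidden: int, inner: int) -> int:
    """Approximate multiply-accumulates for one feed-forward block."""
    # Two linear layers: hidden -> inner and inner -> hidden.
    return 2 * seq_len * hidden * inner


def projection_only_ratio(hidden: int, inner: int) -> float:
    """FFN-to-MHA ratio when only the weight matrices are counted."""
    return (2 * hidden * inner) / (4 * hidden * hidden)


if __name__ == "__main__":
    hidden, inner, seq_len = 768, 3072, 128  # assumed BERT-base-like sizes
    print(projection_only_ratio(hidden, inner))
    print(ffn_macs(seq_len, hidden, inner) / mha_macs(seq_len, hidden))
```

Under these assumptions the FFN weight matrices account for twice the multiply-accumulates of the MHA projections, and the sequence-length-dependent attention term narrows that gap as inputs get longer.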