Search Results for author: Xuekai Zhu

Found 4 papers, 2 papers with code

Critical Data Size of Language Models from a Grokking Perspective

no code implementations · 19 Jan 2024 · Xuekai Zhu, Yao Fu, BoWen Zhou, Zhouhan Lin

We formalize the phase transition under the grokking configuration into the Data Efficiency Hypothesis and identify data insufficiency, sufficiency, and surplus regimes in language model training dynamics.

Language Modelling · Memorization

CRaSh: Clustering, Removing, and Sharing Enhance Fine-tuning without Full Large Language Model

no code implementations · 24 Oct 2023 · Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, BoWen Zhou

Instruction tuning has recently been recognized as an effective way of aligning Large Language Models (LLMs) to enhance their generalization ability across various tasks.

Clustering · Language Modelling · +1

PaD: Program-aided Distillation Can Teach Small Models Reasoning Better than Chain-of-thought Fine-tuning

1 code implementation · 23 May 2023 · Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, BoWen Zhou

While large language models (LLMs) excel in various natural language processing tasks, their huge size and the inaccessibility of parameters present challenges for practical deployment.

Arithmetic Reasoning · GSM8K · +1

StoryTrans: Non-Parallel Story Author-Style Transfer with Discourse Representations and Content Enhancing

1 code implementation · 29 Aug 2022 · Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu

Moreover, to enhance content preservation, we design a mask-and-fill framework to explicitly fuse style-specific keywords of source texts into the generation.

Sentence · Style Transfer · +1
