no code implementations • 19 Jan 2024 • Xuekai Zhu, Yao Fu, BoWen Zhou, Zhouhan Lin
We formalize the phase transition under the grokking configuration into the Data Efficiency Hypothesis and identify data insufficiency, sufficiency, and surplus regimes in language model training dynamics.
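As a minimal, hypothetical sketch of the idea (not the paper's code), the three regimes can be illustrated by labeling training-set sizes against a grokking-style learning curve, where test accuracy stays near chance below a critical data size and then jumps; the `threshold`, sizes, and accuracies below are invented for illustration:

```python
# Hypothetical sketch: classify data sizes into the insufficiency /
# sufficiency / surplus regimes of the Data Efficiency Hypothesis,
# given final test accuracy at each training-set size.

def classify_regimes(data_sizes, test_accs, threshold=0.9):
    """Label each data size by regime, assuming a grokking-style curve."""
    # Critical data size: smallest size whose accuracy clears the threshold.
    critical = next(
        (n for n, acc in zip(data_sizes, test_accs) if acc >= threshold), None
    )
    regimes = {}
    for n in data_sizes:
        if critical is None or n < critical:
            regimes[n] = "insufficiency"  # memorization, no generalization
        elif n == critical:
            regimes[n] = "sufficiency"    # phase transition: generalization emerges
        else:
            regimes[n] = "surplus"        # more data, diminishing returns
    return regimes

# Toy learning-curve data (illustrative only).
sizes = [1_000, 5_000, 10_000, 50_000, 100_000]
accs = [0.12, 0.15, 0.91, 0.95, 0.96]
print(classify_regimes(sizes, accs))
# {1000: 'insufficiency', 5000: 'insufficiency', 10000: 'sufficiency',
#  50000: 'surplus', 100000: 'surplus'}
```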
no code implementations • 24 Oct 2023 • Kaiyan Zhang, Ning Ding, Biqing Qi, Xuekai Zhu, Xinwei Long, BoWen Zhou
Instruction tuning has recently been recognized as an effective way of aligning Large Language Models (LLMs) and enhancing their generalization ability across various tasks.
1 code implementation • 23 May 2023 • Xuekai Zhu, Biqing Qi, Kaiyan Zhang, Xinwei Long, Zhouhan Lin, BoWen Zhou
While large language models (LLMs) excel at various natural language processing tasks, their huge size and inaccessible parameters pose challenges for practical deployment.
1 code implementation • 29 Aug 2022 • Xuekai Zhu, Jian Guan, Minlie Huang, Juan Liu
Moreover, to enhance content preservation, we design a mask-and-fill framework that explicitly fuses style-specific keywords from the source text into the generated output.
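A minimal, hypothetical sketch of a mask-and-fill step follows (not the authors' implementation, and a sentiment lexicon stands in for their learned style keywords and generator): style-specific tokens in the source are masked, then the masks are filled with target-style words while the remaining content words are preserved verbatim.

```python
# Hypothetical mask-and-fill sketch: a lookup table stands in for a
# learned generator; STYLE_KEYWORDS and TARGET_FILLS are invented.

STYLE_KEYWORDS = {"terrible", "awful"}      # assumed source-style lexicon
TARGET_FILLS = {"terrible": "wonderful",    # assumed target-style substitutes
                "awful": "delightful"}

def mask_source(text):
    """Replace style-specific keywords with a [MASK] placeholder,
    remembering the original word for each mask."""
    return [("[MASK]", tok) if tok.lower() in STYLE_KEYWORDS else (tok, None)
            for tok in text.split()]

def fill_masks(masked):
    """Fill each mask with a target-style word; content tokens pass through."""
    return " ".join(
        TARGET_FILLS.get(orig.lower(), tok) if tok == "[MASK]" else tok
        for tok, orig in masked
    )

source = "The food was terrible and the service was awful"
print(fill_masks(mask_source(source)))
# -> "The food was wonderful and the service was delightful"
```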