1 code implementation • 28 Mar 2024 • Ang Lv, Kaiyi Zhang, Yuhan Chen, Yulong Wang, Lifeng Liu, Ji-Rong Wen, Jian Xie, Rui Yan
In this paper, we explore in depth the mechanisms employed by Transformer-based language models in factual recall tasks.
1 code implementation • 12 Jan 2024 • Kaiyi Zhang, Ang Lv, Yuhan Chen, Hansen Ha, Tao Xu, Rui Yan
In this paper, by treating in-context learning (ICL) as a meta-optimization process, we explain why LLMs are sensitive to the order of ICL examples.
1 code implementation • 19 Dec 2023 • Kaiyi Zhang, Yang Chen, Ximing Yang, Weizhong Zhang, Cheng Jin
Based on this process, we introduce SGAS, a model for part editing that employs two strategies: feature disentanglement and feature constraint.
1 code implementation • 13 Nov 2023 • Ang Lv, Kaiyi Zhang, Shufang Xie, Quan Tu, Yuhan Chen, Ji-Rong Wen, Rui Yan
Recent studies have highlighted a phenomenon in large language models (LLMs) known as "the reversal curse," in which the order of knowledge entities in the training data biases the models' comprehension.
1 code implementation • 7 Mar 2022 • Nathaniel Lahn, Sharath Raghvendra, Kaiyi Zhang
Interestingly, unlike the Sinkhorn algorithm, our method readily provides both a compact transport plan and a solution to an approximate version of the dual formulation of the OT problem, both of which have numerous applications in machine learning.
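The paper's combinatorial method is not reproduced here; for contrast with the Sinkhorn baseline it mentions, the following minimal pure-Python sketch (a hypothetical 2x2 toy instance, not from the paper) shows what a transport plan and the entropic dual potentials look like.

```python
import math

# Toy optimal transport instance: move mass from marginal a to marginal b
# under ground cost C. All values here are illustrative assumptions.
a = [0.5, 0.5]          # source marginal
b = [0.25, 0.75]        # target marginal
C = [[0.0, 1.0],
     [1.0, 0.0]]        # cost of moving mass from source i to target j

eps = 0.05              # entropic regularization strength
K = [[math.exp(-c / eps) for c in row] for row in C]  # Gibbs kernel

u = [1.0, 1.0]
v = [1.0, 1.0]
for _ in range(500):    # Sinkhorn iterations: alternate marginal scalings
    u = [a[i] / sum(K[i][j] * v[j] for j in range(2)) for i in range(2)]
    v = [b[j] / sum(K[i][j] * u[i] for i in range(2)) for j in range(2)]

# Dense transport plan P = diag(u) K diag(v); its rows sum to a, columns to b.
P = [[u[i] * K[i][j] * v[j] for j in range(2)] for i in range(2)]

# Entropic dual potentials recovered from the scaling vectors.
f = [eps * math.log(ui) for ui in u]
g = [eps * math.log(vj) for vj in v]
```

Note the contrast the abstract draws: the Sinkhorn plan `P` above is dense (every entry positive), whereas a compact plan would have support on only a few entries.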
no code implementations • 6 Feb 2022 • Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
Moreover, missing patterns in real point clouds are diverse, yet existing methods can only handle fixed ones, which limits their generalization ability.
1 code implementation • 10 Dec 2021 • Kaiyi Zhang, Ximing Yang, Yuan Wu, Cheng Jin
The points generated by AXform do not have the strong 2-manifold constraint, which improves the generation of non-smooth surfaces.