no code implementations • 19 Dec 2024 • Lecheng Wang, Xianjie Shi, Ge Li, Jia Li, Yihong Dong, Xuanming Zhang, Wenpin Jiao, Hong Mei
We present a new finding: the performance of LMs gradually declines when they are trained on recursively generated text, until they perform no better than a randomly initialized LM.
no code implementations • 15 Apr 2024 • Yang Lin, Xinyu Ma, Xu Chu, Yujie Jin, Zhibang Yang, Yasha Wang, Hong Mei
We then explain the theoretical mechanism of LoRA Dropout from the perspective of sparsity regularization by providing a generalization error bound under this framework.
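A minimal sketch of how dropout might be applied inside a LoRA-adapted linear layer, for illustration only: the class name, the placement of dropout on the low-rank branch, and all hyper-parameters are assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn

class LoRADropoutLinear(nn.Module):
    """Sketch: a frozen linear layer with a low-rank LoRA update,
    where standard dropout is applied to the low-rank branch during
    training. Names and details are assumptions for illustration."""
    def __init__(self, in_features, out_features, rank=8, p=0.1, alpha=16.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)   # pretrained weight stays frozen
        self.A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, rank))
        self.dropout = nn.Dropout(p)              # randomly zeroes low-rank activations
        self.scaling = alpha / rank

    def forward(self, x):
        lora_update = self.dropout(x @ self.A.T) @ self.B.T
        return self.base(x) + self.scaling * lora_update

# toy usage
x = torch.randn(4, 32)
layer = LoRADropoutLinear(32, 64)
print(layer(x).shape)  # torch.Size([4, 64])
```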
no code implementations • 21 Mar 2024 • Yang Yao, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Xu Chu, Yuekui Yang, Wenwu Zhu, Hong Mei
In this paper, we propose LLM4GraphGen to explore the ability of LLMs for graph generation through systematic task designs and extensive experiments.
1 code implementation • 6 Sep 2023 • Yuqi Zhu, Ge Li, YunFei Zhao, Jia Li, Zhi Jin, Hong Mei
By analyzing the loss distributions of code tokens, we find that code tokens fall into two categories: challenging tokens that are difficult to predict and confident tokens that can be easily inferred.
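A minimal sketch of the kind of per-token loss analysis described above, for illustration only: the function name, the fixed loss threshold, and the split rule are assumptions rather than the paper's exact criterion.

```python
import torch
import torch.nn.functional as F

def split_code_tokens(logits, targets, threshold=2.0):
    """Sketch: compute per-token cross-entropy and split tokens into
    'challenging' (high loss) and 'confident' (low loss) sets.
    The threshold is an assumed, illustrative value."""
    per_token_loss = F.cross_entropy(
        logits.view(-1, logits.size(-1)),   # (batch * seq, vocab)
        targets.view(-1),                   # (batch * seq,)
        reduction="none",
    ).view(targets.shape)                   # back to (batch, seq)

    challenging = per_token_loss > threshold   # boolean mask over token positions
    confident = ~challenging
    return challenging, confident, per_token_loss

# toy usage with random logits over a 100-token vocabulary
logits = torch.randn(2, 16, 100)
targets = torch.randint(0, 100, (2, 16))
hard, easy, losses = split_code_tokens(logits, targets)
print(hard.sum().item(), easy.sum().item())
```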
no code implementations • 31 Dec 2021 • Xianglin Yang, Yun Lin, Ruofan Liu, Zhenfeng He, Chao Wang, Jin Song Dong, Hong Mei
Moreover, our case study shows that our visual solution reflects the characteristics of various training scenarios well, demonstrating the potential of DVI as a debugging tool for analyzing deep learning training processes.
no code implementations • 5 Feb 2021 • Wenjie Chu, Wei zhang, Haiyan Zhao, Zhi Jin, Hong Mei
Self-assembly plays an essential role in many natural processes, involving the formation and evolution of living and non-living structures, and has potential applications in many emerging domains.
Multiagent Systems • Distributed, Parallel, and Cluster Computing • Robotics
no code implementations • 7 Dec 2020 • Yan Li, Bo An, Junming Ma, Donggang Cao, Yasha Wang, Hong Mei
Hyper-parameter tuning (HPT) is crucial for many machine learning (ML) algorithms.