Search Results for author: Hong Mei

Found 8 papers, 2 papers with code

FANformer: Improving Large Language Models Through Effective Periodicity Modeling

1 code implementation · 28 Feb 2025 · Yihong Dong, Ge Li, Xue Jiang, Yongding Tao, Kechi Zhang, Hao Zhu, Huanyu Liu, Jiazheng Ding, Jia Li, Jinliang Deng, Hong Mei

Our pretrained FANformer-1B exhibits marked improvements on downstream tasks compared to open-source LLMs with similar model parameters or training tokens.

Theoretical Proof that Auto-regressive Language Models Collapse when Real-world Data is a Finite Set

no code implementations · 19 Dec 2024 · Lecheng Wang, Xianjie Shi, Ge Li, Jia Li, Xuanming Zhang, Yihong Dong, Wenpin Jiao, Hong Mei

Auto-regressive language models (LMs) have been widely used to generate data in data-scarce domains to train new LMs, compensating for the scarcity of real-world data.

LoRA Dropout as a Sparsity Regularizer for Overfitting Control

no code implementations · 15 Apr 2024 · Yang Lin, Xinyu Ma, Xu Chu, Yujie Jin, Zhibang Yang, Yasha Wang, Hong Mei

We then analyze the theoretical mechanism of LoRA Dropout from the perspective of sparsity regularization by providing a generalization error bound under this framework.

parameter-efficient fine-tuning
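The snippet above frames LoRA Dropout as a sparsity regularizer on the low-rank update. A minimal sketch of that idea, assuming dropout masks are applied to the rows of `A` and the columns of `B` (the function name, shapes, and inverted-dropout rescaling are illustrative, not the paper's implementation):

```python
import numpy as np

def lora_dropout_delta(A, B, p, rng):
    """Illustrative sketch: dropout on LoRA factors A (d x r) and B (r x k).

    Zeroing random rows of A and columns of B sparsifies the low-rank
    update Delta W = A @ B, which is the regularization effect described
    in the abstract. Rescaling keeps the update's expected magnitude.
    """
    mask_a = rng.random(A.shape[0]) > p      # keep mask over rows of A
    mask_b = rng.random(B.shape[1]) > p      # keep mask over columns of B
    A_d = A * mask_a[:, None]
    B_d = B * mask_b[None, :]
    scale = 1.0 / ((1.0 - p) ** 2)           # inverted-dropout rescaling
    return scale * (A_d @ B_d)

rng = np.random.default_rng(0)
A = rng.standard_normal((8, 2))              # d x r low-rank factor
B = rng.standard_normal((2, 4))              # r x k low-rank factor
delta = lora_dropout_delta(A, B, p=0.5, rng=rng)
print(delta.shape)                           # (8, 4)
```

With `p=0` the masks keep everything and the update reduces to the plain LoRA product `A @ B`.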

Exploring the Potential of Large Language Models in Graph Generation

no code implementations · 21 Mar 2024 · Yang Yao, Xin Wang, Zeyang Zhang, Yijian Qin, Ziwei Zhang, Xu Chu, Yuekui Yang, Wenwu Zhu, Hong Mei

In this paper, we propose LLM4GraphGen to explore the ability of LLMs for graph generation with systematic task designs and extensive experiments.

Drug Discovery · Graph Generation +1

Hot or Cold? Adaptive Temperature Sampling for Code Generation with Large Language Models

1 code implementation · 6 Sep 2023 · Yuqi Zhu, Ge Li, YunFei Zhao, Jia Li, Zhi Jin, Hong Mei

With an analysis of loss distributions of code tokens, we find that code tokens can be divided into two categories: challenging tokens that are difficult to predict and confident tokens that can be easily inferred.

Code Generation
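The loss-based split above suggests sampling hotter on challenging tokens and colder on confident ones. A minimal sketch under that assumption, using the entropy of the next-token distribution as a stand-in confidence signal (the temperatures, threshold, and entropy criterion are illustrative, not the paper's actual method):

```python
import numpy as np

def adaptive_temperature_sample(logits, rng, t_hot=1.2, t_cold=0.2,
                                entropy_threshold=1.0):
    """Illustrative confidence-adaptive sampling for one token position.

    High-entropy (challenging) positions use a hot temperature to keep
    exploration; low-entropy (confident) positions use a cold temperature
    so the easy token is taken almost greedily.
    """
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    entropy = -(probs * np.log(probs + 1e-12)).sum()
    t = t_hot if entropy > entropy_threshold else t_cold
    scaled = np.exp(logits / t - (logits / t).max())
    scaled /= scaled.sum()
    return rng.choice(len(logits), p=scaled), t

rng = np.random.default_rng(0)
confident = np.array([8.0, 0.1, 0.1, 0.1])   # one clearly dominant token
token, t = adaptive_temperature_sample(confident, rng)
print(t)                                     # low entropy -> cold, 0.2
```

A near-uniform logit vector would instead cross the entropy threshold and be sampled at the hot temperature.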

DeepVisualInsight: Time-Travelling Visualization for Spatio-Temporal Causality of Deep Classification Training

no code implementations · 31 Dec 2021 · Xianglin Yang, Yun Lin, Ruofan Liu, Zhenfeng He, Chao Wang, Jin Song Dong, Hong Mei

Moreover, our case study shows that our visual solution reflects the characteristics of various training scenarios well, demonstrating the potential of DVI as a debugging tool for analyzing deep learning training processes.

Active Learning · Deep Learning

Massive Self-Assembly in Grid Environments

no code implementations · 5 Feb 2021 · Wenjie Chu, Wei Zhang, Haiyan Zhao, Zhi Jin, Hong Mei

Self-assembly plays an essential role in many natural processes, involving the formation and evolution of living or non-living structures, and shows potential applications in many emerging domains.

Multiagent Systems · Distributed, Parallel, and Cluster Computing · Robotics
