no code implementations • 3 Oct 2024 • Zhipei Xu, Xuanyu Zhang, Runyi Li, Zecheng Tang, Qing Huang, Jian Zhang
The rapid development of generative AI is a double-edged sword: it facilitates content creation but also makes image manipulation easier to perform and harder to detect.
1 code implementation • 3 Oct 2024 • Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang
Long-context models (LCMs) have made remarkable strides in recent years, offering users great convenience for handling tasks that involve long context, such as document summarization.
1 code implementation • 30 Aug 2024 • Weijie Liu, Zecheng Tang, Juntao Li, Kehai Chen, Min Zhang
This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context language modeling by utilizing an external retriever for historical information retrieval.
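The entry above only names the idea, so here is a minimal sketch of the retrieval side: storing past context chunks in an external memory and pulling back the most relevant ones for the current query. The hashed bag-of-words encoder and the class name are illustrative stand-ins, not MemLong's actual components.

```python
import numpy as np

class ToyChunkMemory:
    """Toy external memory over past context chunks (illustrative only)."""

    def __init__(self, dim=64):
        self.dim = dim
        self.keys = []
        self.chunks = []

    def _embed(self, text):
        # Hashed bag-of-words vector; a real system would use a trained encoder.
        v = np.zeros(self.dim)
        for tok in text.lower().split():
            v[hash(tok.strip(".,:;?!")) % self.dim] += 1.0
        n = np.linalg.norm(v)
        return v / n if n else v

    def store(self, chunk):
        # Index a chunk of historical context for later retrieval.
        self.keys.append(self._embed(chunk))
        self.chunks.append(chunk)

    def retrieve(self, query, k=2):
        # Return the k most similar stored chunks by cosine similarity.
        sims = np.stack(self.keys) @ self._embed(query)
        top = np.argsort(-sims)[:k]
        return [self.chunks[i] for i in top]

memory = ToyChunkMemory()
for chunk in ["chapter 1: setup", "chapter 2: methods", "chapter 3: results"]:
    memory.store(chunk)
print(memory.retrieve("summarize the methods section", k=1))
```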
1 code implementation • 9 May 2024 • Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang
Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and significant inference costs, which impede their practical application.
2 code implementations • 26 Feb 2024 • Yuyang Ding, Juntao Li, Pinzheng Wang, Zecheng Tang, Bowen Yan, Min Zhang
In the Named Entity Recognition (NER) task, recent advances have seen remarkable improvements from LLMs across a broad range of entity domains via instruction tuning with an entity-centric schema.
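To make "instruction tuning with an entity-centric schema" concrete, here is a hypothetical instruction format for an NER training sample; the paper's exact schema and wording are assumptions, and the helper name is made up.

```python
# Hypothetical entity-centric instruction format for NER fine-tuning.
def build_ner_instruction(sentence, entity_types):
    types = ", ".join(entity_types)
    return (
        f"Extract all entities of the types [{types}] from the text below.\n"
        "Answer as a list of (entity, type) pairs.\n"
        f"Text: {sentence}"
    )

prompt = build_ner_instruction(
    "Marie Curie worked at the University of Paris.",
    ["Person", "Organization"],
)
print(prompt)
# Gold completion used as the training target:
# [("Marie Curie", "Person"), ("University of Paris", "Organization")]
```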
no code implementations • 30 Jan 2024 • Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan
To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, which disrupts the model's ability to capture the true semantic representation of visual scenes.
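The "discrete grid token" step the entry criticizes is, in most systems, a VQ-style nearest-codebook lookup. The sketch below shows that step under that assumption; the codebook size, feature dimension, and grid size are illustrative, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
codebook = rng.standard_normal((512, 16))   # 512 visual tokens, dim 16
patches = rng.standard_normal((8 * 8, 16))  # an 8x8 grid of patch features

# Nearest-neighbour lookup: each patch feature collapses to one token id,
# which is where fine-grained semantic information can be lost.
d = ((patches[:, None, :] - codebook[None, :, :]) ** 2).sum(-1)
token_ids = d.argmin(axis=1).reshape(8, 8)
print(token_ids.shape, token_ids[0, :4])    # (8, 8) grid of discrete ids
```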
1 code implementation • 20 Oct 2023 • Zecheng Tang, Kaifeng Qi, Juntao Li, Min Zhang
By leveraging augmentation data generated by the GEC models themselves during post-training and introducing regularization data for cycle training, our proposed method effectively improves the robustness of well-trained GEC models at the cost of only a few additional training epochs.
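One plausible reading of self-generated augmentation plus cycle-style regularization is sketched below; this is a hedged interpretation, not the paper's recipe, and `gec_model` is a stand-in callable mapping an erroneous sentence to a correction.

```python
# Hedged sketch: build extra training pairs from the GEC model's own outputs.
def build_augmentation_pairs(gec_model, noisy_sentences):
    pairs = []
    for src in noisy_sentences:
        hyp = gec_model(src)  # the model's own correction of src
        pairs.append((src, hyp))
        # Feeding the correction back through the model encourages
        # idempotence (already-correct text stays fixed), acting as a
        # regularization signal in a second cycle.
        pairs.append((hyp, gec_model(hyp)))
    return pairs

toy_gec = lambda s: s.replace("goed", "went")
print(build_augmentation_pairs(toy_gec, ["She goed home ."]))
```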
1 code implementation • 19 Sep 2023 • Juntao Li, Zecheng Tang, Yuyang Ding, Pinzheng Wang, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu, Guodong Zhou, Min Zhang
This report provides the main details needed to pre-train an analogous model, including pre-training data processing, Bilingual Flan data collection, the empirical observations that informed our model architecture design, the training objectives of different stages, and other enhancement techniques.
1 code implementation • 18 Sep 2023 • Zecheng Tang, Chenfei Wu, Juntao Li, Nan Duan
Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception.
2 code implementations • 16 Aug 2023 • Zecheng Tang, Keyan Zhou, Juntao Li, Yuyang Ding, Pinzheng Wang, Bowen Yan, Min Zhang
In view of this, we introduce a Context-aware Model self-Detoxification (CMD) framework that attends to both the context and the detoxification process, i.e., first detoxifying the context and then having the language model generate along the safe context.
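The two-stage structure is easy to show schematically. In the sketch below both stages are stubbed with placeholders; the real framework uses learned models for detoxification and generation, so nothing here reflects CMD's actual components.

```python
# Schematic of the detoxify-then-generate pipeline (stubs, not CMD itself).
def detoxify(context: str) -> str:
    # Stage 1: rewrite the context to remove toxic spans (stubbed).
    return context.replace("you idiot", "").strip()

def generate(model, context: str) -> str:
    # Stage 2: the language model continues from the sanitized context.
    return model(context)

toy_model = lambda ctx: ctx + " ... [continuation]"
safe_context = detoxify("Listen, you idiot, the meeting moved to 3pm.")
print(generate(toy_model, safe_context))
```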
1 code implementation • 8 May 2023 • Zecheng Tang, Pinzheng Wang, Keyan Zhou, Juntao Li, Ziqiang Cao, Min Zhang
Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space.
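The mapping into continuous space typically means embedding the tokens and applying the standard Gaussian forward-noising process to the embeddings. The sketch below assumes that Diffusion-LM-style setup; the noise schedule and dimensions are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim, T = 1000, 32, 100
emb = rng.standard_normal((vocab_size, dim)) * 0.1        # embedding table
alpha_bar = np.cumprod(1.0 - np.linspace(1e-4, 0.02, T))  # noise schedule

tokens = np.array([3, 17, 256, 9])  # a discrete "sentence"
x0 = emb[tokens]                    # map discrete text into continuous space

# Forward diffusion at timestep t: x_t = sqrt(a_bar)*x_0 + sqrt(1-a_bar)*eps
t = 50
noise = rng.standard_normal(x0.shape)
xt = np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1 - alpha_bar[t]) * noise
print(xt.shape)  # (4, 32): noised continuous representation of the text
```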
3 code implementations • 8 Mar 2023 • Chenfei Wu, Shengming Yin, Weizhen Qi, Xiaodong Wang, Zecheng Tang, Nan Duan
To this end, we build a system called Visual ChatGPT, incorporating different Visual Foundation Models, to enable the user to interact with ChatGPT by 1) sending and receiving not only language but also images, and 2) providing complex visual questions or visual editing instructions that require the collaboration of multiple AI models across multiple steps.
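A toy dispatcher conveys the routing idea: a request either goes to a captioning tool or an editing tool. The real system uses a prompt manager that lets ChatGPT itself pick among actual Visual Foundation Models; the keyword rule and tool functions below are stand-ins.

```python
# Toy request router in the spirit of the system above (stand-ins only).
def image_captioner(image_path):
    return f"a caption for {image_path}"

def image_editor(image_path, instruction):
    return f"{image_path} edited with '{instruction}'"

TOOLS = {"caption": image_captioner, "edit": image_editor}

def dispatch(user_request, image_path):
    # Crude keyword routing; the real system lets the LLM choose the tool.
    if any(w in user_request.lower() for w in ("edit", "remove", "replace")):
        return TOOLS["edit"](image_path, user_request)
    return TOOLS["caption"](image_path)

print(dispatch("Please remove the tree from the photo", "photo.png"))
print(dispatch("What is in this picture?", "photo.png"))
```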
1 code implementation • 31 Oct 2022 • Zhaochen Su, Zecheng Tang, Xinyan Guan, Juntao Li, Lijun Wu, Min Zhang
Existing methods mainly perform continual training to mitigate such a misalignment.
2 code implementations • 31 Jul 2022 • Peng Xia, Yuechi Zhou, Ziyan Zhang, Zecheng Tang, Juntao Li
Given the poor robustness of existing Chinese grammatical error correction models on adversarial test sets and their large parameter counts, this paper applies knowledge distillation to compress the model and improve its resistance to attacks.
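The generic knowledge-distillation objective behind such compression is the classic soft-target/hard-target mix. The loss below is the standard formulation, not necessarily the paper's exact weighting; temperature and alpha values are illustrative.

```python
import torch
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=2.0, alpha=0.5):
    # Soft targets: match the teacher's tempered output distribution.
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=-1),
        F.softmax(teacher_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)  # rescale gradients to account for the temperature
    # Hard targets: ordinary cross-entropy against gold labels.
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1 - alpha) * hard

student = torch.randn(4, 10, requires_grad=True)  # small student's logits
teacher = torch.randn(4, 10)                      # large teacher's logits
labels = torch.tensor([1, 3, 5, 7])
print(distillation_loss(student, teacher, labels))
```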