no code implementations • CCL 2020 • Lin Wang, Juntao Li, Rui Yan, Dongyan Zhao
Story generation is the challenging task of automatically producing natural language that describes a sequence of events, requiring output with not only a consistent topic but also novel wording.
no code implementations • 9 Nov 2024 • Minghan Li, Eric Gaussier, Juntao Li, Guodong Zhou
Comprehensive experiments on long-document datasets, including TREC 2019 DL, Robust04, and MLDR-zh, show that KeyB2 outperforms baselines like RankLLaMA and KeyB by reducing reranking time and GPU memory usage while enhancing retrieval performance, achieving new SOTA results on TREC 2019 DL with higher NDCG@10 and MAP scores.
1 code implementation • 24 Oct 2024 • Zecheng Tang, Zechen Sun, Juntao Li, Qiaoming Zhu, Min Zhang
To overcome the GPU memory-bound issue caused by the long sequence, LOGO employs a reference-free preference optimization strategy and adopts a position synthesis method to construct the training data.
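The abstract does not spell out LOGO's objective, but the reference-free idea can be illustrated with a SimPO-style length-normalized preference loss that needs no frozen reference model. The scale `beta`, margin `gamma`, and the normalization below are assumptions for illustration, not LOGO's published recipe:

```python
# Minimal sketch of a reference-free preference loss (SimPO-style).
# LOGO's exact objective and position-synthesis pipeline may differ.
import torch
import torch.nn.functional as F

def reference_free_preference_loss(chosen_logps, chosen_lens,
                                   rejected_logps, rejected_lens,
                                   beta=2.0, gamma=1.0):
    """chosen_logps/rejected_logps: summed token log-probs per sequence."""
    # Length-normalize so no frozen reference model is needed.
    chosen = beta * chosen_logps / chosen_lens
    rejected = beta * rejected_logps / rejected_lens
    # Encourage the chosen response to beat the rejected one by a margin.
    return -F.logsigmoid(chosen - rejected - gamma).mean()

# Toy usage with fake per-sequence log-probabilities.
loss = reference_free_preference_loss(
    torch.tensor([-12.0]), torch.tensor([10.0]),
    torch.tensor([-20.0]), torch.tensor([10.0]))
print(float(loss))
```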
1 code implementation • 24 Oct 2024 • Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Qiaoming Zhu, Min Zhang
The availability of high-quality data is one of the most important factors in improving the reasoning capability of LLMs.
no code implementations • 23 Oct 2024 • Yixin Ji, Yang Xiang, Juntao Li, Qingrong Xia, Ping Li, Xinyu Duan, Zhefeng Wang, Min Zhang
As large language models (LLMs) are widely applied across various fields, model compression has become increasingly crucial for reducing costs and improving inference efficiency.
1 code implementation • 21 Oct 2024 • Wangjie You, Zecheng Tang, Juntao Li, Lili Yao, Min Zhang
Large language models (LLMs) have advanced significantly due to the attention mechanism, but their quadratic complexity and linear memory demands limit their performance on long-context tasks.
1 code implementation • 3 Oct 2024 • Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang
Long-context models (LCMs) have made remarkable strides in recent years, offering users great convenience for handling tasks that involve long context, such as document summarization.
1 code implementation • 30 Aug 2024 • Weijie Liu, Zecheng Tang, Juntao Li, Kehai Chen, Min Zhang
This work introduces MemLong: Memory-Augmented Retrieval for Long Text Generation, a method designed to enhance the capabilities of long-context language modeling by utilizing an external retriever for historical information retrieval.
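As a rough illustration of the retrieve-from-history idea (not MemLong's actual chunking or retriever), a minimal chunk memory might look like the following, with Jaccard word overlap standing in for a trained dense retriever:

```python
# Minimal sketch of memory-augmented retrieval for long-text generation:
# past text is chunked and stored; the current context is used as a query
# to fetch the most relevant historical chunks, which would then be
# prepended to the model input.
class ChunkMemory:
    def __init__(self):
        self.chunks = []

    def write(self, chunk: str) -> None:
        self.chunks.append(chunk)

    def retrieve(self, query: str, k: int = 2) -> list[str]:
        def score(chunk: str) -> float:
            a, b = set(chunk.lower().split()), set(query.lower().split())
            return len(a & b) / max(1, len(a | b))  # toy Jaccard similarity
        return sorted(self.chunks, key=score, reverse=True)[:k]

memory = ChunkMemory()
memory.write("Alice founded the lab in 2019.")
memory.write("The lab moved to Berlin in 2021.")
memory.write("Bob joined as an engineer in 2022.")
print(memory.retrieve("When did the lab move to Berlin?", k=1))
```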
1 code implementation • 22 Aug 2024 • Zhaochen Su, Jun Zhang, Xiaoye Qu, Tong Zhu, Yanshu Li, Jiashuo Sun, Juntao Li, Min Zhang, Yu Cheng
Only a few studies have explored the conflicts between the inherent knowledge of LLMs and the retrieved contextual knowledge.
1 code implementation • 25 Jun 2024 • Jikai Wang, Yi Su, Juntao Li, Qingrong Xia, Zi Ye, Xinyu Duan, Zhefeng Wang, Min Zhang
It searches for the optimal tree structure that maximizes the mathematical expectation of the acceptance length at each decoding step.
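Under the simplifying assumptions that each drafted token is accepted independently with an estimated probability and that a token can only be accepted if all its ancestors are, the expected acceptance length of a draft tree is the sum, over all nodes, of the product of probabilities along each node's root path. Greedily adding the candidate with the highest path product then maximizes this sum for a fixed node budget. The following is a toy sketch of that scoring idea, not the paper's exact algorithm:

```python
# Score a draft tree by expected acceptance length: each node contributes
# the product of acceptance probabilities along its root path; a max-heap
# greedily expands the most promising candidates up to a node budget.
import heapq

def best_tree_expected_acceptance(root_candidates, expand, budget):
    """root_candidates: list of (prob, token); expand(token) -> children."""
    heap = [(-p, tok) for p, tok in root_candidates]  # max-heap on path prob
    heapq.heapify(heap)
    expectation, used = 0.0, 0
    while heap and used < budget:
        neg_path_p, tok = heapq.heappop(heap)
        path_p = -neg_path_p
        expectation += path_p          # node's contribution to E[length]
        used += 1
        for child_p, child_tok in expand(tok):
            heapq.heappush(heap, (-path_p * child_p, child_tok))
    return expectation

# Toy draft model: every token offers two continuations with fixed confidence.
toy_expand = lambda tok: [(0.6, tok + "a"), (0.3, tok + "b")]
print(best_tree_expected_acceptance([(0.9, "x")], toy_expand, budget=5))
```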
1 code implementation • 20 Jun 2024 • Zhaochen Su, Jun Zhang, Tong Zhu, Xiaoye Qu, Juntao Li, Min Zhang, Yu Cheng
Therefore, we propose a crucial question: Can we build a universal framework to handle a variety of temporal reasoning tasks?
no code implementations • 17 Jun 2024 • Ruili Jiang, Kehai Chen, Xuefeng Bai, Zhixuan He, Juntao Li, Muyun Yang, Tiejun Zhao, Liqiang Nie, Min Zhang
In this survey, we review the progress in exploring human preference learning for LLMs from a preference-centered perspective, covering the sources and formats of preference feedback, the modeling and usage of preference signals, as well as the evaluation of the aligned LLMs.
1 code implementation • 13 Jun 2024 • Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min Zhang
Temporal reasoning is fundamental for large language models (LLMs) to comprehend the world.
1 code implementation • 3 Jun 2024 • Yi Su, Yunpeng Tai, Yixin Ji, Juntao Li, Bowen Yan, Min Zhang
Large Language Models (LLMs) have demonstrated an impressive capability known as In-context Learning (ICL), which enables them to acquire knowledge from textual demonstrations without the need for parameter updates.
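Concretely, ICL amounts to prepending labeled demonstrations to the test input and letting the frozen model continue the pattern, with no gradient updates. A minimal prompt builder (the template is purely illustrative):

```python
# In-context learning in its simplest form: k demonstrations are
# concatenated before the query, and the frozen LLM completes the pattern.
demos = [("great movie, loved it", "positive"),
         ("boring and too long", "negative")]

def build_icl_prompt(demos, query):
    lines = [f"Review: {x}\nSentiment: {y}" for x, y in demos]
    lines.append(f"Review: {query}\nSentiment:")
    return "\n\n".join(lines)

print(build_icl_prompt(demos, "a delightful surprise"))
```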
1 code implementation • 20 May 2024 • Xiaobo Liang, Haoke Zhang, Helan Hu, Juntao Li, Jun Xu, Min Zhang
The rapid advancement of large language models has given rise to a plethora of applications across a myriad of real-world tasks, mainly centered on aligning with human intent.
1 code implementation • 17 May 2024 • Yixin Ji, Yang Xiang, Juntao Li, Wei Chen, Zhongyi Liu, Kehai Chen, Min Zhang
To address the challenges of low-rank compression in LLMs, we conduct empirical research on the low-rank characteristics of large models.
1 code implementation • 9 May 2024 • Dan Qiao, Yi Su, Pinzheng Wang, Jing Ye, Wenjing Xie, Yuechi Zhou, Yuyang Ding, Zecheng Tang, Jikai Wang, Yixin Ji, Yue Wang, Pei Guo, Zechen Sun, Zikang Zhang, Juntao Li, Pingfu Chao, Wenliang Chen, Guohong Fu, Guodong Zhou, Qiaoming Zhu, Min Zhang
Large Language Models (LLMs) have played an important role in many fields due to their powerful capabilities. However, their massive number of parameters leads to high deployment requirements and incurs significant inference costs, which impedes their practical applications.
2 code implementations • 26 Feb 2024 • Yuyang Ding, Juntao Li, Pinzheng Wang, Zecheng Tang, Bowen Yan, Min Zhang
In the Named Entity Recognition (NER) task, instruction tuning with entity-centric schemas has recently yielded remarkable improvements for LLMs across a broad range of entity domains.
no code implementations • 30 Jan 2024 • Zecheng Tang, Chenfei Wu, Zekai Zhang, Mingheng Ni, Shengming Yin, Yu Liu, Zhengyuan Yang, Lijuan Wang, Zicheng Liu, Juntao Li, Nan Duan
To leverage LLMs for visual synthesis, traditional methods convert raster image information into discrete grid tokens through specialized visual modules, which disrupts the model's ability to capture the true semantic representation of visual scenes.
no code implementations • 16 Dec 2023 • Xueying Du, Mingwei Liu, Juntao Li, Hanlin Wang, Xin Peng, Yiling Lou
Evaluating IntDiagSolver on multiple LLMs, including ChatGPT, Claude, and CodeLlama, reveals consistent improvements in the accuracy of crash bug resolution.
1 code implementation • 20 Nov 2023 • Lei Geng, Xu Yan, Ziqiang Cao, Juntao Li, Wenjie Li, Sujian Li, Xinjie Zhou, Yang Yang, Jun Zhang
We construct a multilingual biomedical corpus by incorporating knowledge alignments at three granularities (entity, fact, and passage levels) into monolingual corpora.
1 code implementation • 20 Oct 2023 • Zecheng Tang, Kaifeng Qi, Juntao Li, Min Zhang
By leveraging augmented data generated by the GEC models themselves during post-training and introducing regularization data for cycle training, our proposed method effectively improves the robustness of well-trained GEC models at the cost of only a few additional training epochs.
1 code implementation • 16 Oct 2023 • Haoke Zhang, Yue Wang, Juntao Li, Xiabing Zhou, Min Zhang
Large Language Models (LLMs) have demonstrated incredible capabilities in understanding, generating, and manipulating languages.
1 code implementation • 19 Sep 2023 • Juntao Li, Zecheng Tang, Yuyang Ding, Pinzheng Wang, Pei Guo, Wangjie You, Dan Qiao, Wenliang Chen, Guohong Fu, Qiaoming Zhu, Guodong Zhou, Min Zhang
This report provides the main details needed to pre-train an analogous model, including pre-training data processing, Bilingual Flan data collection, the empirical observations that inspired our model architecture design, the training objectives of different stages, and other enhancement techniques.
1 code implementation • 18 Sep 2023 • Zecheng Tang, Chenfei Wu, Juntao Li, Nan Duan
Graphic layout generation, a growing research field, plays a significant role in user engagement and information perception.
no code implementations • 24 Aug 2023 • Yue Wang, Xinrui Wang, Juntao Li, Jinxiong Chang, Qishen Zhang, Zhongyi Liu, Guannan Zhang, Min Zhang
Instruction tuning is instrumental in enabling Large Language Models (LLMs) to follow user instructions to complete various open-domain tasks.
1 code implementation • 19 Aug 2023 • Dan Qiao, Chenfei Wu, Yaobo Liang, Juntao Li, Nan Duan
In this paper, we propose GameEval, a novel approach to evaluating LLMs through goal-driven conversational games, overcoming the limitations of previous methods.
3 code implementations • 16 Aug 2023 • Zecheng Tang, Keyan Zhou, Juntao Li, Yuyang Ding, Pinzheng Wang, Bowen Yan, Rejie Hua, Min Zhang
In view of this, we introduce a Context-aware Model self-Detoxification (CMD) framework that attends to both the context and the detoxification process, i.e., first detoxifying the context and then making the language model generate along the safe context.
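The described two-stage flow can be sketched as a pipeline skeleton. The prompts and the stand-in model below are hypothetical placeholders, not CMD's actual implementation or training procedure:

```python
# Two-stage self-detoxification skeleton: the model first rewrites the
# (possibly toxic) context into a safe version, then continues generation
# from that safe context.
def detoxify(model, context: str) -> str:
    # Hypothetical stage 1: the model itself rewrites the context safely.
    return model(f"Rewrite the following text without toxic language:\n{context}")

def generate_safely(model, context: str) -> str:
    safe_context = detoxify(model, context)               # stage 1: clean
    return model(f"Continue the text:\n{safe_context}")   # stage 2: generate

toy_model = lambda prompt: f"<model output for: {prompt[:40]}...>"
print(generate_safely(toy_model, "some user-provided context"))
```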
2 code implementations • NeurIPS 2023 • Tong Wu, Zhihao Fan, Xiao Liu, Yeyun Gong, Yelong Shen, Jian Jiao, Hai-Tao Zheng, Juntao Li, Zhongyu Wei, Jian Guo, Nan Duan, Weizhu Chen
Diffusion models have gained significant attention in the realm of image generation due to their exceptional performance.
1 code implementation • 8 May 2023 • Zecheng Tang, Pinzheng Wang, Keyan Zhou, Juntao Li, Ziqiang Cao, Min Zhang
Diffusion models have been successfully adapted to text generation tasks by mapping the discrete text into the continuous space.
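The key adaptation is running the diffusion process in embedding space and rounding denoised vectors back to discrete tokens. A minimal sketch of the forward (noising) step and the rounding operation, with an illustrative noise schedule and toy dimensions:

```python
# Continuous-space text diffusion, forward step: tokens -> embeddings,
# standard DDPM Gaussian noising q(x_t | x_0), then "rounding" back to
# the nearest token embedding. Schedule and sizes are illustrative.
import torch

vocab_size, dim, T = 100, 16, 50
emb = torch.nn.Embedding(vocab_size, dim)
alphas_bar = torch.cumprod(1 - torch.linspace(1e-4, 0.05, T), dim=0)

tokens = torch.tensor([3, 17, 42])
x0 = emb(tokens)                                   # discrete -> continuous
t = 25
noise = torch.randn_like(x0)
xt = alphas_bar[t].sqrt() * x0 + (1 - alphas_bar[t]).sqrt() * noise

# Rounding: map each (noisy) vector to its nearest token embedding.
nearest = torch.cdist(xt, emb.weight).argmin(dim=-1)
print(tokens.tolist(), "->", nearest.tolist())
```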
no code implementations • 25 Apr 2023 • Yi Su, Yixin Ji, Juntao Li, Hai Ye, Min Zhang
Accordingly, in this paper, we propose perturbation consistency learning (PCL), a simple test-time adaptation method that encourages the model to make stable predictions for samples with distribution shifts.
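An illustrative test-time adaptation step in this spirit: run the model on an input and a perturbed copy, then update the parameters so the two predictive distributions agree. The Gaussian input noise and symmetric KL objective below are assumptions for the sketch, not PCL's exact recipe:

```python
# Perturbation-consistency TTA sketch: adapt at test time by minimizing
# the disagreement between predictions on original and perturbed inputs.
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Linear(8, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 3))
optimizer = torch.optim.SGD(model.parameters(), lr=1e-3)

def tta_step(x):
    logits = model(x)
    logits_pert = model(x + 0.1 * torch.randn_like(x))  # perturbed view
    p, q = logits.log_softmax(-1), logits_pert.log_softmax(-1)
    # Symmetric KL encourages stable predictions under perturbation.
    loss = 0.5 * (F.kl_div(q, p.exp(), reduction="batchmean") +
                  F.kl_div(p, q.exp(), reduction="batchmean"))
    optimizer.zero_grad(); loss.backward(); optimizer.step()
    return logits.detach()

preds = tta_step(torch.randn(4, 8))
print(preds.argmax(-1))
```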
no code implementations • COLING 2022 • Ziming Huang, Zhuoxuan Jiang, Ke Wang, Juntao Li, Shanshan Feng, Xian-Ling Mao
Although most existing methods can fulfil this requirement, they can only model single-source dialog data and cannot effectively capture the underlying knowledge of relations among data and subtasks.
no code implementations • 14 Mar 2023 • Pei Guo, Yisheng Xiao, Juntao Li, Min Zhang
Non-autoregressive neural machine translation (NAT) models are proposed to accelerate the inference process while maintaining relatively high performance.
1 code implementation • 13 Mar 2023 • Yisheng Xiao, Ruiyang Xu, Lijun Wu, Juntao Li, Tao Qin, Tie-Yan Liu, Min Zhang
Experiments on 3 different tasks (neural machine translation, summarization, and code generation) with 15 datasets in total confirm that our proposed simple method achieves significant performance improvements over the strong CMLM model.
1 code implementation • 9 Feb 2023 • Hai Ye, Yuyang Ding, Juntao Li, Hwee Tou Ng
To answer this question, we evaluate test-time adaptation (TTA) to improve a model after deployment.
1 code implementation • 31 Oct 2022 • Zhaochen Su, Zecheng Tang, Xinyan Guan, Juntao Li, Lijun Wu, Min Zhang
Existing methods mainly perform continual training to mitigate such a misalignment.
1 code implementation • COLING 2022 • Dan Qiao, Chenchen Dai, Yuyang Ding, Juntao Li, Qiang Chen, Wenliang Chen, Min Zhang
The conventional success of text classification relies on annotated data, and the new paradigm of pre-trained language models (PLMs) still requires a small amount of labeled data for downstream tasks.
2 code implementations • 31 Jul 2022 • Peng Xia, Yuechi Zhou, Ziyan Zhang, Zecheng Tang, Juntao Li
Given the poor robustness of existing Chinese grammatical error correction models on attack test sets and their large numbers of parameters, this paper applies knowledge distillation to compress the models and improve their resistance to attacks.
1 code implementation • 20 Apr 2022 • Yisheng Xiao, Lijun Wu, Junliang Guo, Juntao Li, Min Zhang, Tao Qin, Tie-Yan Liu
While NAR generation can significantly accelerate inference speed for machine translation, the speedup comes at the cost of sacrificed translation accuracy compared to its counterpart, autoregressive (AR) generation.
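The speed/accuracy trade-off follows directly from the decoding structure: AR decoding issues one model call per token, each conditioned on the growing prefix, while NAR decoding predicts all positions in a single parallel pass. A toy contrast with stand-in functions in place of real models:

```python
# AR vs. NAR decoding in miniature: sequential, prefix-conditioned calls
# versus one parallel call over all target positions.
def ar_decode(predict_next, length):
    prefix = []
    for _ in range(length):            # `length` sequential model calls
        prefix.append(predict_next(prefix))
    return prefix

def nar_decode(predict_all, length):
    return predict_all(length)         # one parallel model call

toy_next = lambda prefix: f"tok{len(prefix)}"
toy_all = lambda n: [f"tok{i}" for i in range(n)]
print(ar_decode(toy_next, 4), nar_decode(toy_all, 4))
```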
no code implementations • 28 Mar 2022 • Min Cao, Shiping Li, Juntao Li, Liqiang Nie, Min Zhang
On top of this, efficiency-focused research on ITR systems is introduced as the third perspective.
2 code implementations • 13 Dec 2021 • Chong Liu, Xiaoyang Liu, Rongqin Zheng, Lixin Zhang, Xiaobo Liang, Juntao Li, Lijun Wu, Min Zhang, Leyu Lin
Recently proposed state-of-the-art sequential recommendation models combine contrastive learning techniques to obtain high-quality user representations.
no code implementations • 1 Oct 2021 • Chongyang Tao, Jiazhan Feng, Chang Liu, Juntao Li, Xiubo Geng, Daxin Jiang
For this task, the adoption of pre-trained language models (such as BERT) has led to remarkable progress in a number of benchmarks.
no code implementations • 29 Sep 2021 • Xiaobo Liang, Runze Mao, Lijun Wu, Juntao Li, Weiqing Liu, Qing Li, Min Zhang
Consistency training is commonly performed at the data level, typically using a data augmentation strategy (or adversarial training) to make the predictions for the augmented input and the original input consistent, so that the model becomes more robust and generalizes better.
no code implementations • 29 Sep 2021 • Yue Wang, Lijun Wu, Xiaobo Liang, Juntao Li, Min Zhang
Starting from the resurgence of deep learning, language models (LMs) have never been so popular.
8 code implementations • NeurIPS 2021 • Xiaobo Liang, Lijun Wu, Juntao Li, Yue Wang, Qi Meng, Tao Qin, Wei Chen, Min Zhang, Tie-Yan Liu
Dropout is a powerful and widely used technique to regularize the training of deep neural networks.
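A minimal sketch of the two-forward-pass consistency idea this entry describes (popularized as R-Drop): feed the same batch through the network twice so the dropout masks differ, then add a symmetric KL term between the two predictive distributions to the task loss. The weight `alpha` and toy network are illustrative:

```python
# R-Drop-style regularization: two stochastic forward passes per batch,
# cross-entropy on both, plus symmetric KL between the two distributions.
import torch
import torch.nn.functional as F

model = torch.nn.Sequential(torch.nn.Linear(10, 32), torch.nn.ReLU(),
                            torch.nn.Dropout(0.3), torch.nn.Linear(32, 5))

def rdrop_loss(x, y, alpha=1.0):
    logits1, logits2 = model(x), model(x)        # two dropout masks
    ce = 0.5 * (F.cross_entropy(logits1, y) + F.cross_entropy(logits2, y))
    p1, p2 = logits1.log_softmax(-1), logits2.log_softmax(-1)
    kl = 0.5 * (F.kl_div(p1, p2.exp(), reduction="batchmean") +
                F.kl_div(p2, p1.exp(), reduction="batchmean"))
    return ce + alpha * kl

loss = rdrop_loss(torch.randn(4, 10), torch.randint(0, 5, (4,)))
loss.backward()
print(float(loss))
```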
no code implementations • NAACL 2021 • Chongyang Tao, Shen Gao, Juntao Li, Yansong Feng, Dongyan Zhao, Rui Yan
Sequential information, a.k.a. order, is assumed to be essential for processing a sequence with recurrent neural network or convolutional neural network based encoders.
no code implementations • 17 Mar 2021 • Juntao Li, Chang Liu, Chongyang Tao, Zhangming Chan, Dongyan Zhao, Min Zhang, Rui Yan
To fill the gap between these up-to-date methods and the real-world applications, we incorporate user-specific dialogue history into the response selection and propose a personalized hybrid matching network (PHMN).
no code implementations • 10 Mar 2021 • Mingfei Guo, Xiuying Chen, Juntao Li, Dongyan Zhao, Rui Yan
Automatically identifying fake news from the Internet is a challenging problem in deception detection tasks.
1 code implementation • 23 Nov 2020 • Juntao Li, Ruidan He, Hai Ye, Hwee Tou Ng, Lidong Bing, Rui Yan
Experimental results show that our proposed method achieves significant performance improvements over the state-of-the-art pretrained cross-lingual language model in the CLCD setting.
2 code implementations • EMNLP 2020 • Hai Ye, Qingyu Tan, Ruidan He, Juntao Li, Hwee Tou Ng, Lidong Bing
To improve the robustness of self-training, in this paper we present class-aware feature self-distillation (CFd) to learn discriminative features from PrLMs, in which PrLM features are self-distilled into a feature adaptation module and the features from the same class are more tightly clustered.
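The two ingredients named here can be sketched as two loss terms: a self-distillation term that makes a small adaptation module reproduce the frozen PrLM's features, and a class-aware term that pulls same-class features toward their class mean. Both forms below are simplified assumptions about CFd's details, not its published formulation:

```python
# Simplified feature self-distillation sketch: MSE distillation into an
# adaptation module plus a within-class variance penalty for clustering.
import torch
import torch.nn.functional as F

prlm_dim = 32
adapter = torch.nn.Linear(prlm_dim, prlm_dim)

def cfd_like_loss(prlm_feats, labels):
    adapted = adapter(prlm_feats)
    distill = F.mse_loss(adapted, prlm_feats)      # self-distillation term
    cluster = 0.0
    for c in labels.unique():                      # class-aware clustering
        members = adapted[labels == c]
        cluster = cluster + ((members - members.mean(0)) ** 2).mean()
    return distill + cluster

loss = cfd_like_loss(torch.randn(8, prlm_dim), torch.randint(0, 3, (8,)))
loss.backward()
print(float(loss))
```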
1 code implementation • 30 May 2020 • Liuyuan Chen, Kanglei Zhou, Junchang Jing, Haiju Fan, Juntao Li
Next, the Lagrangian multipliers are proved to equal 1 as the regularization parameter approaches infinity; thus, a simple yet effective initialization algorithm is devised.
1 code implementation • 17 May 2020 • Juntao Li, Chang Liu, Jian Wang, Lidong Bing, Hongsong Li, Xiaozhong Liu, Dongyan Zhao, Rui Yan
We manually collect a new and high-quality paired dataset, where each pair contains an unordered product attribute set in the source language and an informative product description in the target language.
no code implementations • IJCNLP 2019 • Zhangming Chan, Juntao Li, Xiaopeng Yang, Xiuying Chen, Wenpeng Hu, Dongyan Zhao, Rui Yan
In this work, we improve the Wasserstein autoencoder (WAE) for response generation.
no code implementations • IJCNLP 2019 • Zhangming Chan, Xiuying Chen, Yongliang Wang, Juntao Li, Zhiqiang Zhang, Kun Gai, Dongyan Zhao, Rui Yan
Different from other text generation tasks, in product description generation, it is of vital importance to generate faithful descriptions that stick to the product attribute information.
no code implementations • ACL 2019 • Lisong Qiu, Juntao Li, Wei Bi, Dongyan Zhao, Rui Yan
Due to its potential applications, open-domain dialogue generation has become popular and achieved remarkable progress in recent years, but sometimes suffers from generic responses.
no code implementations • EMNLP 2018 • Juntao Li, Yan Song, Haisong Zhang, Dongmin Chen, Shuming Shi, Dongyan Zhao, Rui Yan
It is a challenging task to automatically compose poems with not only fluent expressions but also aesthetic wording.