1 code implementation • 20 Feb 2025 • Hanlin Wang, Jian Wang, Chak Tou Leong, Wenjie Li
To address this, we highlight the importance of timely calibration and the need to automatically construct calibration trajectories for training agents.
no code implementations • 19 Feb 2025 • Chak Tou Leong, Qingyu Yin, Jian Wang, Wenjie Li
The safety alignment of large language models (LLMs) remains vulnerable, as their initial behavior can be easily jailbroken by even relatively simple attacks.
1 code implementation • 17 Feb 2025 • Heming Xia, Yongqi Li, Chak Tou Leong, Wenjie Wang, Wenjie Li
Chain-of-Thought (CoT) prompting has proven effective in enhancing the reasoning capabilities of large language models (LLMs).
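A canonical illustration of CoT prompting (generic, not specific to this paper): the same question posed with and without an explicit reasoning trigger.

```python
# Generic CoT prompting illustration; the prompt wording is an arbitrary example.
plain = "Q: A bag has 3 red and 5 blue balls. How many balls in total?\nA:"
cot = plain + " Let's think step by step."
# With the CoT trigger, an LLM typically writes out intermediate reasoning
# ("3 red + 5 blue = 8 balls") before stating the final answer, which tends
# to improve accuracy on multi-step problems.
```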
no code implementations • 11 Dec 2024 • Jiaqi Chen, Xiaoye Zhu, Tianyang Liu, Ying Chen, Xinhui Chen, Yiwen Yuan, Chak Tou Leong, Zuchao Li, Tang Long, Lei Zhang, Chenyu Yan, Guanghao Mei, Jie Zhang, Lefei Zhang
Large Language Models (LLMs) have revolutionized text generation, making the detection of machine-generated text increasingly challenging.
no code implementations • 12 Nov 2024 • Qingyu Yin, Chak Tou Leong, Hongbo Zhang, Minjun Zhu, Hanqi Yan, Qiang Zhang, Yulan He, Wenjie Li, Jun Wang, Yue Zhang, Linyi Yang
The alignment of large language models (LLMs) with human preferences remains a key challenge.
no code implementations • 9 Oct 2024 • Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Chak Tou Leong, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li
In this work, we propose a novel preference learning framework called eRror-Injected Self-Editing (RISE), which injects predefined subtle errors into partial tokens of correct solutions to construct hard pairs for error mitigation.
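A minimal sketch of the pair-construction idea described above: corrupt a few tokens of a correct solution to produce a hard negative, then pair it with the original for preference learning. All helper names and the toy error vocabulary are hypothetical illustrations, not the authors' RISE implementation.

```python
import random

def inject_subtle_error(solution_tokens, error_vocab, n_errors=1):
    """Corrupt a few tokens of a correct solution to build a 'hard' negative."""
    corrupted = list(solution_tokens)
    for pos in random.sample(range(len(corrupted)), k=n_errors):
        # Swap in a plausible-but-wrong token (e.g., a nearby number).
        corrupted[pos] = random.choice(error_vocab)
    return corrupted

def build_preference_pair(question, correct_solution_tokens, error_vocab):
    """A DPO-style pair: the correct solution is preferred over its corrupted twin."""
    rejected = inject_subtle_error(correct_solution_tokens, error_vocab)
    return {
        "prompt": question,
        "chosen": " ".join(correct_solution_tokens),
        "rejected": " ".join(rejected),
    }

pair = build_preference_pair(
    "What is 12 * (3 + 4)?",
    "12 * (3 + 4) = 12 * 7 = 84".split(),
    error_vocab=["74", "96", "6"],
)
print(pair["rejected"])  # e.g. "12 * (3 + 4) = 12 * 7 = 96"
```

Because the rejected answer differs from the chosen one by only a few tokens, the pair forces the model to discriminate exactly at the error site rather than on surface style.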
1 code implementation • 7 Oct 2024 • Qingyu Yin, Xuzheng He, Luoao Deng, Chak Tou Leong, Fan Wang, Yanzhao Yan, Xiaoyu Shen, Qiang Zhang
Fine-tuning and in-context learning (ICL) are two prevalent methods for imbuing large language models with task-specific knowledge.
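A minimal sketch contrasting the two paradigms, assuming a generic Hugging Face causal LM ("gpt2" is an arbitrary stand-in); illustrative only, not the paper's experimental setup.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

# In-context learning: task knowledge is supplied purely in the prompt;
# the weights stay frozen.
prompt = "Translate to French.\nhello -> bonjour\ncat -> chat\ndog ->"
out = model.generate(**tok(prompt, return_tensors="pt"), max_new_tokens=5)
print(tok.decode(out[0], skip_special_tokens=True))

# Fine-tuning: the same knowledge is instead written into the weights
# via gradient updates on labeled examples.
batch = tok("dog -> chien", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()
torch.optim.AdamW(model.parameters(), lr=5e-5).step()
```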
no code implementations • 5 Sep 2024 • Hanlin Wang, Chak Tou Leong, Jian Wang, Wenjie Li
Language models are exhibiting increasingly strong capabilities in knowledge utilization and reasoning.
no code implementations • 20 Jun 2024 • Yi Cheng, Wenge Liu, Kaishuai Xu, Wenjun Hou, Yi Ouyang, Chak Tou Leong, Xian Wu, Yefeng Zheng
However, imbuing agents with autonomous adaptability presents unique challenges, including identifying optimal adaptations to meet users' expectations and ensuring a smooth transition during the adaptation process.
no code implementations • 25 May 2024 • Chak Tou Leong, Yi Cheng, Kaishuai Xu, Jian Wang, Hanlin Wang, Wenjie Li
In particular, we analyze the two most representative types of attack approaches: Explicit Harmful Attack (EHA) and Identity-Shifting Attack (ISA).
1 code implementation • 10 Feb 2024 • Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiao-Yong Wei
Tuning language models for dialogue generation has been a prevalent paradigm for building capable dialogue agents.
1 code implementation • 11 Jan 2024 • Jiashuo Wang, Chunpu Xu, Chak Tou Leong, Wenjie Li, Jing Li
An emotional support conversation system aims to alleviate users' emotional distress and assist them in addressing their challenges.
1 code implementation • 19 Dec 2023 • Yi Cheng, Wenge Liu, Jian Wang, Chak Tou Leong, Yi Ouyang, Wenjie Li, Xian Wu, Yefeng Zheng
In recent years, there has been a growing interest in exploring dialogues with more complex goals, such as negotiation, persuasion, and emotional support, which go beyond traditional service-focused dialogue systems.
2 code implementations • 14 Oct 2023 • Chak Tou Leong, Yi Cheng, Jiashuo Wang, Jian Wang, Wenjie Li
Drawing on this idea, we devise a method to identify the toxification direction from the normal generation process to the one prompted with a negative prefix, and then steer the generation in the reverse direction by manipulating the information movement within the attention layers.
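A generic activation-steering sketch in the spirit of the description above: estimate a "toxification direction" as the difference between hidden states under a negative prefix versus a normal prompt, then push generation the opposite way. The paper manipulates information movement inside the attention layers; this simplified residual-stream version, with an arbitrary layer index and scale, is an illustrative assumption.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2", output_hidden_states=True)

def mean_hidden(text, layer=6):
    """Mean hidden state of a prompt at one layer (layer choice is arbitrary)."""
    ids = tok(text, return_tensors="pt")
    with torch.no_grad():
        return model(**ids).hidden_states[layer].mean(dim=1).squeeze(0)

normal = mean_hidden("You are a helpful assistant.")
negative = mean_hidden("You are a rude, toxic assistant.")
toxic_dir = negative - normal  # estimated toxification direction

def steer(module, inputs, output, alpha=4.0):
    """Forward hook: shift hidden states away from the toxic direction."""
    h = output[0]
    return (h - alpha * toxic_dir / toxic_dir.norm(),) + output[1:]

# Hook a mid-layer transformer block, generate, then clean up.
handle = model.transformer.h[6].register_forward_hook(steer)
ids = tok("The user annoyed me, so I", return_tensors="pt")
print(tok.decode(model.generate(**ids, max_new_tokens=20)[0]))
handle.remove()
```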
1 code implementation • 11 Oct 2023 • Jian Wang, Yi Cheng, Dongding Lin, Chak Tou Leong, Wenjie Li
Target-oriented dialogue systems, designed to proactively steer conversations toward predefined targets or accomplish specific system-side goals, are an exciting area in conversational AI.