no code implementations • 18 Feb 2025 • Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma, Aobo Kong, Fei Huang, Jianbin Jiao, Junge Zhang
Large Language Models (LLMs) have shown impressive reasoning capabilities in well-defined problems with clear solutions, such as mathematics and coding.
1 code implementation • 8 Jan 2025 • Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Yangyi Chen, Hamid Alinejad-Rokny, Fei Huang
In the alignment phase, a pre-trained speech model is further trained on text-image tasks to generalize from vision to speech in a (near) zero-shot manner, outperforming models trained on tri-modal datasets.
1 code implementation • 3 Jan 2025 • Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang
To address these limitations, we propose Segment-Level Direct Preference Optimization (SDPO), which focuses on specific key segments within interactions to optimize multi-turn agent behavior while minimizing training noise.
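The segment-level idea can be sketched as a masked variant of the standard DPO objective, where only tokens inside annotated key segments contribute to the preference comparison. This is a minimal illustration, not the paper's implementation: the per-token log-probability differences and the segment masks are assumed to be provided by the caller.

```python
import math

def sdpo_loss(logp_chosen, logp_rejected, mask_chosen, mask_rejected, beta=0.1):
    """Segment-level DPO loss sketch.

    logp_chosen / logp_rejected: per-token log-prob differences
    (policy minus reference) for the preferred and dispreferred turns.
    mask_*: 1 for tokens inside a key segment, 0 elsewhere, so only
    key-segment tokens enter the comparison (the 'segment-level' part).
    """
    chosen = sum(lp * m for lp, m in zip(logp_chosen, mask_chosen))
    rejected = sum(lp * m for lp, m in zip(logp_rejected, mask_rejected))
    # Standard DPO form: -log sigmoid(beta * (chosen - rejected))
    margin = beta * (chosen - rejected)
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

With all-ones masks this reduces to the usual sequence-level DPO loss; restricting the masks to key segments is what removes the training noise from irrelevant turns.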
no code implementations • 9 Sep 2024 • Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li
This framework iteratively improves data quality through a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution, generating a more complex and diverse image-text instruction dataset that empowers MLLMs with enhanced capabilities.
no code implementations • 21 Jun 2024 • Ruixuan Xiao, Wentao Ma, Ke Wang, Yuchuan Wu, Junbo Zhao, Haobo Wang, Fei Huang, Yongbin Li
Existing agent benchmarks largely ignore workflow knowledge, despite its importance for guiding LLM-based planners in real-world tasks.
Motivated by this, we formalize different formats of workflow knowledge and present FlowBench, the first benchmark for workflow-guided planning.
1 code implementation • 22 Apr 2024 • Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, DaCheng Tao, Jingren Zhou
To address this issue, self-evolution approaches that enable LLMs to autonomously acquire, refine, and learn from experiences generated by the model itself are rapidly growing.
1 code implementation • 29 Mar 2024 • Qinhao Zhou, Zihan Zhang, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li
As intelligent agents, LLMs need to have the capabilities of task planning, long-term memory, and the ability to leverage external tools to achieve satisfactory performance.
1 code implementation • CVPR 2024 • Yuwen Tan, Qinhao Zhou, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li
We observe that adapter tuning demonstrates superiority over prompt-based methods, even without parameter expansion in each learning session.
1 code implementation • 4 Mar 2024 • Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li
Furthermore, it is complementary to existing methods.
1 code implementation • 7 Dec 2023 • Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan
Specifically, crucial information in the context may be overlooked by the model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.
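The trough-zone effect can be illustrated with a toy computation: sum each key position's received attention across all queries to form a "waveform" over positions, then flag the positions that fall in its bottom quantile. The quantile threshold is an assumption for illustration, not the paper's detection method.

```python
import numpy as np

def attention_received(attn):
    """attn: (num_queries, num_keys) attention matrix, rows sum to 1.
    Returns the total attention each key position receives across
    queries -- a rough 'waveform' over context positions."""
    return attn.sum(axis=0)

def trough_positions(wave, frac=0.25):
    """Positions whose received attention falls at or below the bottom
    `frac` quantile -- a toy proxy for trough zones, where information
    placed there is more likely to be overlooked."""
    threshold = np.quantile(wave, frac)
    return np.where(wave <= threshold)[0]
```

For example, with three queries that each attend mostly to the first position, the last position receives little total attention and is flagged as a trough.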
Ranked #2 on Trajectory Planning on ToolBench
no code implementations • 30 Oct 2023 • Huawen Feng, Yan Fan, Xiong Liu, Ting-En Lin, Zekun Yao, Yuchuan Wu, Fei Huang, Yongbin Li, Qianli Ma
Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation.
no code implementations • 10 Oct 2023 • Tianshu Yu, Ting-En Lin, Yuchuan Wu, Min Yang, Fei Huang, Yongbin Li
This limitation leads to suboptimal performance, even when ample training data is available.
no code implementations • 22 Sep 2023 • Haoyu Gao, Ting-En Lin, Hangyu Li, Min Yang, Yuchuan Wu, Wentao Ma, Yongbin Li
Task-oriented dialogue (TOD) systems facilitate users in executing various activities via multi-turn dialogues, but Large Language Models (LLMs) often struggle to comprehend these intricate contexts.
no code implementations • 20 Sep 2023 • Yucheng Cai, Wentao Ma, Yuchuan Wu, Shuzheng Si, Yuan Shao, Zhijian Ou, Yongbin Li
Using the high-quality prompts generated, we scale the corpus of the pre-trained conversation model to 122 datasets from 15 dialog-related tasks, resulting in Universal Pre-trained Conversation Model (UniPCM), a powerful foundation model for various conversational tasks and different dialog systems.
2 code implementations • 4 Sep 2023 • Zaijing Li, Ting-En Lin, Yuchuan Wu, Meng Liu, Fengxiao Tang, Ming Zhao, Yongbin Li
Sentiment analysis is a crucial task that aims to understand people's emotional states and predict emotional categories based on multimodal information.
Aspect-Based Sentiment Analysis (ABSA)
1 code implementation • NeurIPS 2023 • Shuzheng Si, Wentao Ma, Haoyu Gao, Yuchuan Wu, Ting-En Lin, Yinpei Dai, Hangyu Li, Rui Yan, Fei Huang, Yongbin Li
SpokenWOZ further incorporates common spoken characteristics such as word-by-word processing and reasoning in spoken language.
1 code implementation • 19 May 2023 • Tianshu Yu, Haoyu Gao, Ting-En Lin, Min Yang, Yuchuan Wu, Wentao Ma, Chao Wang, Fei Huang, Yongbin Li
In this paper, we propose Speech-text dialog Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model.
Ranked #2 on Multimodal Sentiment Analysis on CMU-MOSI (Acc-2 metric, using extra training data)
cross-modal alignment
Emotion Recognition in Conversation
1 code implementation • 4 May 2023 • Haoyu Gao, Rui Wang, Ting-En Lin, Yuchuan Wu, Min Yang, Fei Huang, Yongbin Li
Dialogue Topic Segmentation (DTS) plays an essential role in a variety of dialogue modeling tasks.
no code implementations • 23 Feb 2023 • Yushan Qian, Bo Wang, Ting-En Lin, Yinhe Zheng, Ying Zhu, Dongming Zhao, Yuexian Hou, Yuchuan Wu, Yongbin Li
Empathetic dialogue is a human-like behavior that requires the perception of both affective factors (e.g., emotion status) and cognitive factors (e.g., cause of the emotion).
1 code implementation • 21 Nov 2022 • Guimin Hu, Ting-En Lin, Yi Zhao, Guangming Lu, Yuchuan Wu, Yongbin Li
Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors.
Ranked #2 on Multimodal Sentiment Analysis on CMU-MOSI
no code implementations • 21 Nov 2022 • Yinpei Dai, Wanwei He, Bowen Li, Yuchuan Wu, Zheng Cao, Zhongqi An, Jian Sun, Yongbin Li
Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and the shortage of annotated data.
no code implementations • 30 May 2022 • Ting-En Lin, Yuchuan Wu, Fei Huang, Luo Si, Jian Sun, Yongbin Li
In this paper, we present Duplex Conversation, a multi-turn, multimodal spoken dialogue system that enables telephone-based agents to interact with customers like a human.
1 code implementation • CVPR 2024 • Jing Ma, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li
Black-Box Knowledge Distillation (B2KD) is a formulated problem for cloud-to-edge model compression with invisible data and models hosted on the server.
1 code implementation • Findings (ACL) 2022 • Sai Zhang, Yuwei Hu, Yuchuan Wu, Jiaman Wu, Yongbin Li, Jian Sun, Caixia Yuan, Xiaojie Wang
We find some new linguistic phenomena and interactive manners in SSTOD that raise critical challenges for building dialog agents for the task.
Ranked #1 on SSTOD on SSD_NAME
1 code implementation • 29 Nov 2021 • Wanwei He, Yinpei Dai, Yinhe Zheng, Yuchuan Wu, Zheng Cao, Dermot Liu, Peng Jiang, Min Yang, Fei Huang, Luo Si, Jian Sun, Yongbin Li
Pre-trained models have proven to be powerful in enhancing task-oriented dialog systems.
Ranked #1 on End-To-End Dialogue Modelling on MULTIWOZ 2.0