Search Results for author: Yuchuan Wu

Found 25 papers, 15 papers with code

EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning

no code implementations 18 Feb 2025 Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma, Aobo Kong, Fei Huang, Jianbin Jiao, Junge Zhang

Large Language Models (LLMs) have shown impressive reasoning capabilities in well-defined problems with clear solutions, such as mathematics and coding.

Navigate Reinforcement Learning (RL)

OpenOmni: Large Language Models Pivot Zero-shot Omnimodal Alignment across Language with Real-time Self-Aware Emotional Speech Synthesis

1 code implementation 8 Jan 2025 Run Luo, Ting-En Lin, Haonan Zhang, Yuchuan Wu, Xiong Liu, Min Yang, Yongbin Li, Longze Chen, Jiaming Li, Lei Zhang, Yangyi Chen, Hamid Alinejad-Rokny, Fei Huang

In the alignment phase, a pre-trained speech model is further trained on text-image tasks to generalize from vision to speech in a (near) zero-shot manner, outperforming models trained on tri-modal datasets.

Decoder Emotional Speech Synthesis +3

SDPO: Segment-Level Direct Preference Optimization for Social Agents

1 code implementation 3 Jan 2025 Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang

To address these limitations, we propose Segment-Level Direct Preference Optimization (SDPO), which focuses on specific key segments within interactions to optimize multi-turn agent behavior while minimizing training noise.

MMEvol: Empowering Multimodal Large Language Models with Evol-Instruct

no code implementations 9 Sep 2024 Run Luo, Haonan Zhang, Longze Chen, Ting-En Lin, Xiong Liu, Yuchuan Wu, Min Yang, Minzheng Wang, Pengpeng Zeng, Lianli Gao, Heng Tao Shen, Yunshui Li, Xiaobo Xia, Fei Huang, Jingkuan Song, Yongbin Li

This framework iteratively improves data quality through a refined combination of fine-grained perception, cognitive reasoning, and interaction evolution, generating a more complex and diverse image-text instruction dataset that empowers MLLMs with enhanced capabilities.

Diversity Visual Reasoning

FlowBench: Revisiting and Benchmarking Workflow-Guided Planning for LLM-based Agents

no code implementations 21 Jun 2024 Ruixuan Xiao, Wentao Ma, Ke Wang, Yuchuan Wu, Junbo Zhao, Haobo Wang, Fei Huang, Yongbin Li

Motivated by this, we formalize different formats of workflow knowledge and present FlowBench, the first benchmark for workflow-guided planning.

Benchmarking

A Survey on Self-Evolution of Large Language Models

1 code implementation 22 Apr 2024 Zhengwei Tao, Ting-En Lin, Xiancai Chen, Hangyu Li, Yuchuan Wu, Yongbin Li, Zhi Jin, Fei Huang, DaCheng Tao, Jingren Zhou

To address this issue, self-evolution approaches that enable LLMs to autonomously acquire, refine, and learn from experiences generated by the model itself are rapidly growing.

Diversity Survey

Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning

1 code implementation 29 Mar 2024 Qinhao Zhou, Zihan Zhang, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li

As intelligent agents, LLMs need to have the capabilities of task planning, long-term memory, and the ability to leverage external tools to achieve satisfactory performance.

Hallucination Task Planning

Semantically-Shifted Incremental Adapter-Tuning is A Continual ViTransformer

1 code implementation CVPR 2024 Yuwen Tan, Qinhao Zhou, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li

We observe that adapter tuning demonstrates superiority over prompt-based methods, even without parameter expansion in each learning session.

class-incremental learning Class Incremental Learning +1

Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use

1 code implementation 7 Dec 2023 Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan

Specifically, crucial information in the context may be overlooked by the model when it is positioned in the trough zone of the attention waveform, leading to decreased performance.

RAG Trajectory Planning

Improving Factual Consistency of Text Summarization by Adversarially Decoupling Comprehension and Embellishment Abilities of LLMs

no code implementations 30 Oct 2023 Huawen Feng, Yan Fan, Xiong Liu, Ting-En Lin, Zekun Yao, Yuchuan Wu, Fei Huang, Yongbin Li, Qianli Ma

Despite the recent progress in text summarization made by large language models (LLMs), they often generate summaries that are factually inconsistent with original articles, known as "hallucinations" in text generation.

Text Generation Text Summarization

Self-Explanation Prompting Improves Dialogue Understanding in Large Language Models

no code implementations 22 Sep 2023 Haoyu Gao, Ting-En Lin, Hangyu Li, Min Yang, Yuchuan Wu, Wentao Ma, Yongbin Li

Task-oriented dialogue (TOD) systems facilitate users in executing various activities via multi-turn dialogues, but Large Language Models (LLMs) often struggle to comprehend these intricate contexts.

Dialogue Understanding

UniPCM: Universal Pre-trained Conversation Model with Task-aware Automatic Prompt

no code implementations 20 Sep 2023 Yucheng Cai, Wentao Ma, Yuchuan Wu, Shuzheng Si, Yuan Shao, Zhijian Ou, Yongbin Li

Using the high-quality prompts generated, we scale the corpus of the pre-trained conversation model to 122 datasets from 15 dialog-related tasks, resulting in Universal Pre-trained Conversation Model (UniPCM), a powerful foundation model for various conversational tasks and different dialog systems.

UniSA: Unified Generative Framework for Sentiment Analysis

2 code implementations 4 Sep 2023 Zaijing Li, Ting-En Lin, Yuchuan Wu, Meng Liu, Fengxiao Tang, Ming Zhao, Yongbin Li

Sentiment analysis is a crucial task that aims to understand people's emotional states and predict emotional categories based on multimodal information.

Aspect-Based Sentiment Analysis Aspect-Based Sentiment Analysis (ABSA) +2

SpokenWOZ: A Large-Scale Speech-Text Benchmark for Spoken Task-Oriented Dialogue Agents

1 code implementation NeurIPS 2023 Shuzheng Si, Wentao Ma, Haoyu Gao, Yuchuan Wu, Ting-En Lin, Yinpei Dai, Hangyu Li, Rui Yan, Fei Huang, Yongbin Li

SpokenWOZ further incorporates common spoken characteristics such as word-by-word processing and reasoning in spoken language.

Speech-Text Dialog Pre-training for Spoken Dialog Understanding with Explicit Cross-Modal Alignment

1 code implementation 19 May 2023 Tianshu Yu, Haoyu Gao, Ting-En Lin, Min Yang, Yuchuan Wu, Wentao Ma, Chao Wang, Fei Huang, Yongbin Li

In this paper, we propose Speech-text dialog Pre-training for spoken dialog understanding with ExpliCiT cRoss-Modal Alignment (SPECTRA), which is the first-ever speech-text dialog pre-training model.

Ranked #2 on Multimodal Sentiment Analysis on CMU-MOSI (Acc-2 metric, using extra training data)

cross-modal alignment Emotion Recognition in Conversation +2

Empathetic Response Generation via Emotion Cause Transition Graph

no code implementations 23 Feb 2023 Yushan Qian, Bo Wang, Ting-En Lin, Yinhe Zheng, Ying Zhu, Dongming Zhao, Yuexian Hou, Yuchuan Wu, Yongbin Li

Empathetic dialogue is a human-like behavior that requires the perception of both affective factors (e.g., emotion status) and cognitive factors (e.g., cause of the emotion).

Decoder Empathetic Response Generation +1

UniMSE: Towards Unified Multimodal Sentiment Analysis and Emotion Recognition

1 code implementation 21 Nov 2022 Guimin Hu, Ting-En Lin, Yi Zhao, Guangming Lu, Yuchuan Wu, Yongbin Li

Multimodal sentiment analysis (MSA) and emotion recognition in conversation (ERC) are key research topics for computers to understand human behaviors.

Contrastive Learning Emotion Recognition in Conversation +1

CGoDial: A Large-Scale Benchmark for Chinese Goal-oriented Dialog Evaluation

no code implementations 21 Nov 2022 Yinpei Dai, Wanwei He, Bowen Li, Yuchuan Wu, Zheng Cao, Zhongqi An, Jian Sun, Yongbin Li

Practical dialog systems need to deal with various knowledge sources, noisy user expressions, and the shortage of annotated data.

Goal-Oriented Dialog Retrieval

Duplex Conversation: Towards Human-like Interaction in Spoken Dialogue Systems

no code implementations 30 May 2022 Ting-En Lin, Yuchuan Wu, Fei Huang, Luo Si, Jian Sun, Yongbin Li

In this paper, we present Duplex Conversation, a multi-turn, multimodal spoken dialogue system that enables telephone-based agents to interact with customers like a human.

Data Augmentation Spoken Dialogue Systems

Aligning Logits Generatively for Principled Black-Box Knowledge Distillation

1 code implementation CVPR 2024 Jing Ma, Xiang Xiang, Ke Wang, Yuchuan Wu, Yongbin Li

Black-Box Knowledge Distillation (B2KD) is a formulated problem for cloud-to-edge model compression with invisible data and models hosted on the server.

Federated Learning Knowledge Distillation +1

A Slot Is Not Built in One Utterance: Spoken Language Dialogs with Sub-Slots

1 code implementation Findings (ACL) 2022 Sai Zhang, Yuwei Hu, Yuchuan Wu, Jiaman Wu, Yongbin Li, Jian Sun, Caixia Yuan, Xiaojie Wang

We find some new linguistic phenomena and interactive manners in SSTOD which raise critical challenges of building dialog agents for the task.

Ranked #1 on SSTOD on SSD_NAME

SSTOD
