Search Results for author: Siyin Wang

Found 15 papers, 3 papers with code

QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions

no code implementations • 26 Mar 2025 • Siyin Wang, Wenyi Yu, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Lu Lu, Yu Tsao, Junichi Yamagishi, Yuxuan Wang, Chao Zhang

To bridge this gap, we introduce QualiSpeech, a comprehensive low-level speech quality assessment dataset encompassing 11 key aspects and detailed natural language comments that include reasoning and contextual insights.

World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning

no code implementations • 13 Mar 2025 • Siyin Wang, Zhaoye Fei, Qinyuan Cheng, Shiduo Zhang, Panpan Cai, Jinlan Fu, Xipeng Qiu

Recent advances in large vision-language models (LVLMs) have shown promise for embodied task planning, yet they struggle with fundamental challenges like dependency constraints and efficiency.

Task Planning

Audio Large Language Models Can Be Descriptive Speech Quality Evaluators

no code implementations • 27 Jan 2025 • Chen Chen, Yuchen Hu, Siyin Wang, Helin Wang, Zhehuai Chen, Chao Zhang, Chao-Han Huck Yang, Eng Siong Chng

Recent advances have enabled large language models (LLMs) to incorporate auditory systems for handling various speech-related tasks.

Descriptive

SALMONN-omni: A Codec-free LLM for Full-duplex Speech Understanding and Generation

no code implementations • 27 Nov 2024 • Wenyi Yu, Siyin Wang, Xiaoyu Yang, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Guangzhi Sun, Lu Lu, Yuxuan Wang, Chao Zhang

Unlike traditional modularised conversational AI systems, which separate speech recognition, understanding, and text-to-speech generation into distinct components, multimodal LLMs operate as single end-to-end models.

Question Answering, Speech Enhancement, +3

Enabling Auditory Large Language Models for Automatic Speech Quality Evaluation

1 code implementation • 25 Sep 2024 • Siyin Wang, Wenyi Yu, Yudong Yang, Changli Tang, Yixuan Li, Jimin Zhuang, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Guangzhi Sun, Lu Lu, Yuxuan Wang, Chao Zhang

The results demonstrate that auditory LLMs achieve competitive performance compared to state-of-the-art task-specific small models in predicting MOS and SIM, while also delivering promising results in A/B testing and natural language descriptions.

Text to Speech

Robust Zero-Shot Text-to-Speech Synthesis with Reverse Inference Optimization

no code implementations • 2 Jul 2024 • Yuchen Hu, Chen Chen, Siyin Wang, Eng Siong Chng, Chao Zhang

By leveraging reverse inference as the standard to select exemplars used in RLHF from the speech samples generated by the TTS system itself, RIO steers the subsequent optimization towards a direction of enhancing the TTS robustness.

Inference Optimization, Speech Synthesis, +2

Cross-Modality Safety Alignment

1 code implementation • 21 Jun 2024 • Siyin Wang, Xingsong Ye, Qinyuan Cheng, Junwen Duan, ShiMin Li, Jinlan Fu, Xipeng Qiu, Xuanjing Huang

As Artificial General Intelligence (AGI) becomes increasingly integrated into various facets of human life, ensuring the safety and ethical alignment of such systems is paramount.

Safety Alignment

Bayesian Example Selection Improves In-Context Learning for Speech, Text, and Visual Modalities

no code implementations • 23 Apr 2024 • Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

Large language models (LLMs) can adapt to new tasks through in-context learning (ICL) based on a few examples presented in dialogue history without any model parameter update.

In-Context Learning

In-Memory Learning: A Declarative Learning Framework for Large Language Models

no code implementations • 5 Mar 2024 • Bo Wang, Tianxiang Sun, Hang Yan, Siyin Wang, Qingyuan Cheng, Xipeng Qiu

The exploration of whether agents can align with their environment without relying on human-labeled data presents an intriguing research topic.

Domain Generalization via Causal Adjustment for Cross-Domain Sentiment Analysis

no code implementations • 22 Feb 2024 • Siyin Wang, Jie Zhou, Qin Chen, Qi Zhang, Tao Gui, Xuanjing Huang

Domain adaptation has been widely adopted for cross-domain sentiment analysis to transfer knowledge from the source domain to the target domain.

Domain Generalization, Sentiment Analysis

LLM can Achieve Self-Regulation via Hyperparameter Aware Generation

no code implementations • 17 Feb 2024 • Siyin Wang, ShiMin Li, Tianxiang Sun, Jinlan Fu, Qinyuan Cheng, Jiasheng Ye, Junjie Ye, Xipeng Qiu, Xuanjing Huang

HAG extends the current paradigm in the text generation process, highlighting the feasibility of endowing LLMs with self-regulated decoding strategies.

Text Generation

A Soft Contrastive Learning-based Prompt Model for Few-shot Sentiment Analysis

no code implementations • 16 Dec 2023 • Jingyi Zhou, Jie Zhou, Jiabao Zhao, Siyin Wang, Haijun Shan, Gui Tao, Qi Zhang, Xuanjing Huang

Few-shot text classification has attracted great interest in both academia and industry due to the lack of labeled data in many fields.

Contrastive Learning, Few-Shot Text Classification, +4

Can Whisper perform speech-based in-context learning?

no code implementations • 13 Sep 2023 • Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang

Language-level adaptation experiments using Chinese dialects showed that when applying SICL to isolated word ASR, consistent and considerable relative WER reductions can be achieved using Whisper models of any size on two dialects, which is on average 32.3%.

Automatic Speech Recognition (ASR), +3

Causal Intervention Improves Implicit Sentiment Analysis

no code implementations • COLING 2022 • Siyin Wang, Jie Zhou, Changzhi Sun, Junjie Ye, Tao Gui, Qi Zhang, Xuanjing Huang

In this work, we propose a causal intervention model for Implicit Sentiment Analysis using Instrumental Variable (ISAIV).

Sentence Sentiment Analysis
