Search Results for author: Yi-Chang Chen

Found 12 papers, 8 papers with code

BreezyVoice: Adapting TTS for Taiwanese Mandarin with Enhanced Polyphone Disambiguation -- Challenges and Insights

no code implementations • 29 Jan 2025 • Chan-Jan Hsu, Yi-Cheng Lin, Chia-Chun Lin, Wei-Chih Chen, Ho Lam Chung, Chen-An Li, Yi-Chang Chen, Chien-Yu Yu, Ming-Ji Lee, Chien-Cheng Chen, Ru-Heng Huang, Hung-Yi Lee, Da-Shan Shiu

We present BreezyVoice, a Text-to-Speech (TTS) system specifically adapted for Taiwanese Mandarin, highlighting phonetic control abilities to address the unique challenges of polyphone disambiguation in the language.

Language Modeling, Language Modelling, +3

Enhancing Function-Calling Capabilities in LLMs: Strategies for Prompt Formats, Data Integration, and Multilingual Translation

no code implementations • 2 Dec 2024 • Yi-Chang Chen, Po-chun Hsu, Chan-Jan Hsu, Da-Shan Shiu

This research delves into enhancing the function-calling capabilities of LLMs by exploring different approaches, including prompt formats for integrating function descriptions, blending function-calling and instruction-following data, introducing a novel Decision Token for conditional prompts, leveraging chain-of-thought reasoning, and overcoming multilingual challenges with a translation pipeline.

Data Integration, Instruction Following, +2

Let's Fuse Step by Step: A Generative Fusion Decoding Algorithm with LLMs for Multi-modal Text Recognition

1 code implementation • 23 May 2024 • Chan-Jan Hsu, Yi-Chang Chen, Feng-Ting Liao, Pei-Chen Ho, Yu-Hsiang Wang, Po-chun Hsu, Da-Shan Shiu

We introduce "Generative Fusion Decoding" (GFD), a novel shallow fusion framework, utilized to integrate Large Language Models (LLMs) into multi-modal text recognition systems such as automatic speech recognition (ASR) and optical character recognition (OCR).

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +4

Breeze-7B Technical Report

no code implementations • 5 Mar 2024 • Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

Breeze-7B is an open-source language model based on Mistral-7B, designed to address the need for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

Chatbot, Language Modeling, +1

Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite

1 code implementation • 15 Sep 2023 • Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

In an effort to advance the evaluation of language models in Traditional Chinese and stimulate further research in this field, we have open-sourced our benchmark and opened the model for trial.

Question Answering

Zero-shot Domain-sensitive Speech Recognition with Prompt-conditioning Fine-tuning

1 code implementation • 18 Jul 2023 • Feng-Ting Liao, Yung-Chieh Chan, Yi-Chang Chen, Chan-Jan Hsu, Da-Shan Shiu

In this work, we propose a method to create domain-sensitive speech recognition models that utilize textual domain information by conditioning its generation on a given text prompt.

Domain Adaptation, speech-recognition, +1

Integrated Semantic and Phonetic Post-correction for Chinese Speech Recognition

1 code implementation • ROCLING 2021 • Yi-Chang Chen, Chun-Yen Cheng, Chien-An Chen, Ming-Chieh Sung, Yi-Ren Yeh

Due to the recent advances of natural language processing, several works have applied the pre-trained masked language model (MLM) of BERT to the post-correction of speech recognition.

Language Modeling, Language Modelling, +2
