Search Results for author: Po-chun Hsu

Found 16 papers, 10 papers with code

Breeze-7B Technical Report

no code implementations • 5 Mar 2024 • Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

Breeze-7B is an open-source language model based on Mistral-7B, designed to address the need for improved language comprehension and chatbot-oriented capabilities in Traditional Chinese.

Chatbot Language Modelling

Paper
Add Code

Low-Resource Self-Supervised Learning with SSL-Enhanced TTS

no code implementations • 29 Sep 2023 • Po-chun Hsu, Ali Elkahky, Wei-Ning Hsu, Yossi Adi, Tu Anh Nguyen, Jade Copet, Emmanuel Dupoux, Hung-Yi Lee, Abdelrahman Mohamed

Self-supervised learning (SSL) techniques have achieved remarkable results in various speech processing tasks.

Self-Supervised Learning

Paper
Add Code

Advancing the Evaluation of Traditional Chinese Language Models: Towards a Comprehensive Benchmark Suite

1 code implementation • 15 Sep 2023 • Chan-Jan Hsu, Chang-Le Liu, Feng-Ting Liao, Po-chun Hsu, Yi-Chang Chen, Da-Shan Shiu

In an effort to advance the evaluation of language models in Traditional Chinese and stimulate further research in this field, we have open-sourced our benchmark and opened the model for trial.

Question Answering

146

Paper
Code

Federated Deep Reinforcement Learning for THz-Beam Search with Limited CSI

no code implementations • 25 Apr 2023 • Po-chun Hsu, Li-Hsiang Shen, Chun-Hung Liu, Kai-Ten Feng

Terahertz (THz) communication with ultra-wide available spectrum is a promising technique that can achieve the stringent requirement of high data rate in the next-generation wireless networks, yet its severe propagation attenuation significantly hinders its implementation in practice.

reinforcement-learning

Paper
Add Code

Extending the Pre-Training of BLOOM for Improved Support of Traditional Chinese: Models, Methods and Results

1 code implementation • 8 Mar 2023 • Philipp Ennen, Po-chun Hsu, Chan-Jan Hsu, Chang-Le Liu, Yen-chen Wu, Yin-Hsiang Liao, Chin-Tung Lin, Da-Shan Shiu, Wei-Yun Ma

In this paper we present the multilingual language model BLOOM-zh that features enhanced support for Traditional Chinese.

Language Modelling

146

Paper
Code

Learning Phone Recognition from Unpaired Audio and Phone Sequences Based on Generative Adversarial Network

no code implementations • 29 Jul 2022 • Da-Rong Liu, Po-chun Hsu, Yi-Chen Chen, Sung-Feng Huang, Shun-Po Chuang, Da-Yi Wu, Hung-Yi Lee

GAN training is adopted in the first stage to find the mapping relationship between unpaired speech and phone sequence.

Acoustic Unit Discovery Generative Adversarial Network

Paper
Add Code

STOP: A dataset for Spoken Task Oriented Semantic Parsing

1 code implementation • 29 Jun 2022 • Paden Tomasello, Akshat Shrivastava, Daniel Lazar, Po-chun Hsu, Duc Le, Adithya Sagar, Ali Elkahky, Jade Copet, Wei-Ning Hsu, Yossi Adi, Robin Algayres, Tu Ahn Nguyen, Emmanuel Dupoux, Luke Zettlemoyer, Abdelrahman Mohamed

Furthermore, in addition to the human-recorded audio, we are releasing a TTS-generated version to benchmark the performance for low-resource domain adaptation of end-to-end SLU systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

29,224

Paper
Code

Silence is Sweeter Than Speech: Self-Supervised Model Using Silence to Store Speaker Information

no code implementations • 8 May 2022 • Chi-Luen Feng, Po-chun Hsu, Hung-Yi Lee

We found that HuBERT stores speaker information in representations whose positions correspond to silences in a waveform.

Self-Supervised Learning Speaker Identification

Paper
Add Code

Universal Adaptor: Converting Mel-Spectrograms Between Different Configurations for Speech Synthesis

1 code implementation • 1 Apr 2022 • Fan-Lin Wang, Po-chun Hsu, Da-Rong Liu, Hung-Yi Lee

Most recent speech synthesis systems are composed of a synthesizer and a vocoder.

Speech Synthesis Voice Conversion

Paper
Code

Adversarial Sample Detection for Speaker Verification by Neural Vocoders

1 code implementation • 1 Jul 2021 • Haibin Wu, Po-chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee

We also show that the neural vocoder adopted in the detection framework is dataset-independent.

Speaker Verification

Paper
Code

Investigating on Incorporating Pretrained and Learnable Speaker Representations for Multi-Speaker Multi-Style Text-to-Speech

1 code implementation • 6 Mar 2021 • Chung-Ming Chien, Jheng-Hao Lin, Chien-yu Huang, Po-chun Hsu, Hung-Yi Lee

The few-shot multi-speaker multi-style voice cloning task is to synthesize utterances with voice and speaking style similar to a reference speaker given only a few reference samples.

Voice Cloning Voice Conversion

1,609

Paper
Code

WG-WaveNet: Real-Time High-Fidelity Speech Synthesis without GPU

1 code implementation • 15 May 2020 • Po-chun Hsu, Hung-Yi Lee

As we design a flow-based model that is heavily compressed, the proposed model requires much less computational resources compared to other waveform generation models during both training and inference time; even though the model is highly compressed, the post-filter maintains the quality of generated waveform.

Speech Synthesis Text-To-Speech Synthesis Audio and Speech Processing Sound

Paper
Code

Towards Robust Neural Vocoding for Speech Generation: A Survey

no code implementations • 5 Dec 2019 • Po-chun Hsu, Chun-hsuan Wang, Andy T. Liu, Hung-Yi Lee

We found out that the speaker variety is much more important for achieving a universal vocoder than the language.

Speech Synthesis Voice Conversion

Paper
Add Code

Mockingjay: Unsupervised Speech Representation Learning with Deep Bidirectional Transformer Encoders

7 code implementations • 25 Oct 2019 • Andy T. Liu, Shu-wen Yang, Po-Han Chi, Po-chun Hsu, Hung-Yi Lee

We present Mockingjay as a new speech representation learning approach, where bidirectional Transformer encoders are pre-trained on a large amount of unlabeled speech.

General Classification Representation Learning +3

2,090

Paper
Code

Unsupervised End-to-End Learning of Discrete Linguistic Units for Voice Conversion

1 code implementation • 28 May 2019 • Andy T. Liu, Po-chun Hsu, Hung-Yi Lee

We found that the proposed encoding method offers automatic extraction of speech content from speaker style, and is sufficient to cover full linguistic content in a given language.

Voice Conversion

110

Paper
Code

Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences

1 code implementation • 9 Aug 2018 • Cheng-chieh Yeh, Po-chun Hsu, Ju-chieh Chou, Hung-Yi Lee, Lin-shan Lee

In this way, the length constraint mentioned above is removed to offer rhythm-flexible voice conversion without requiring parallel data.

Sound Audio and Speech Processing

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.