Search Results for author: Hung-Shin Lee

Found 16 papers, 5 papers with code

The Academia Sinica Systems of Voice Conversion for VCC2020

no code implementations • 6 Oct 2020 • Yu-Huai Peng, Cheng-Hung Hu, Alexander Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang

This paper describes the Academia Sinica systems for the two tasks of Voice Conversion Challenge 2020, namely voice conversion within the same language (Task 1) and cross-lingual voice conversion (Task 2).

Voice Conversion

AlloST: Low-resource Speech Translation without Source Transcription

1 code implementation • 1 May 2021 • Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang

In this study, we survey methods to improve ST performance without using source transcription, and propose a learning framework that utilizes a language-independent universal phone recognizer.

Translation

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

1 code implementation • 25 Mar 2022 • Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang

In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition.
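Jointly optimizing the two cascaded structures typically means combining an enhancement loss with a recognition loss in one objective. The sketch below shows a generic weighted-sum formulation; the function names, the MSE enhancement loss, and the weighting scheme are illustrative assumptions, not the paper's exact training criterion.

```python
import numpy as np

def se_mse(enhanced, reference):
    """Mean-squared error between enhanced and reference waveforms
    (a common speech-enhancement loss; assumed here, not from the paper)."""
    enhanced = np.asarray(enhanced, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean((enhanced - reference) ** 2))

def joint_loss(se_loss, asr_loss, alpha=0.5):
    """Weighted sum of the enhancement and recognition losses, so both
    cascaded networks receive gradients from a single objective."""
    return alpha * se_loss + (1.0 - alpha) * asr_loss

# Example: equal weighting of the two cascaded objectives.
total = joint_loss(se_loss=0.8, asr_loss=1.2, alpha=0.5)
```

With `alpha` the balance can be shifted toward either front-end enhancement quality or back-end recognition accuracy.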

Automatic Speech Recognition (ASR) +3

Chain-based Discriminative Autoencoders for Speech Recognition

no code implementations • 25 Mar 2022 • Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang

For application to robust speech recognition, we further extend c-DcAE to hierarchical and parallel structures, resulting in hc-DcAE and pc-DcAE.

Robust Speech Recognition, Speech Recognition

Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition

no code implementations • 28 Mar 2022 • Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang

Phonotactic constraints can be employed to distinguish languages by representing a speech utterance as a multinomial distribution of phone events.
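A multinomial representation of phone events can be sketched as an n-gram histogram normalized into a probability distribution. This is a minimal illustration of the general phonotactic idea, assuming simple bigram counting; the paper's subspace-based construction is more involved.

```python
from collections import Counter

def phone_ngram_distribution(phones, n=2):
    """Represent an utterance (a decoded phone sequence) as a multinomial
    distribution over phone n-grams: count each n-gram, then normalize."""
    ngrams = [tuple(phones[i:i + n]) for i in range(len(phones) - n + 1)]
    counts = Counter(ngrams)
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()}

# Toy phone sequence; the probabilities sum to 1, so each utterance
# becomes one point on the probability simplex.
dist = phone_ngram_distribution(["s", "p", "i", "s", "p"], n=2)
```

Language recognition can then compare such distributions across utterances, since languages differ in which phone sequences they permit.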

Generation of Speaker Representations Using Heterogeneous Training Batch Assembly

no code implementations • 30 Mar 2022 • Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang

In traditional speaker diarization systems, a well-trained speaker model is a key component to extract representations from consecutive and partially overlapping segments in a long speech session.
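The "consecutive and partially overlapping segments" mentioned in the abstract are typically produced by a sliding window over the session. The sketch below is a generic window generator under assumed parameters (1.5 s windows, 50% overlap at 16 kHz); the paper's actual segmentation settings are not given here.

```python
def sliding_segments(num_samples, win, hop):
    """Return (start, end) sample indices of partially overlapping segments
    covering a long session, as commonly used before extracting a speaker
    representation from each segment."""
    segments = []
    start = 0
    while start + win <= num_samples:
        segments.append((start, start + win))
        start += hop
    return segments

# 1.5 s windows with a 0.75 s hop at 16 kHz -> 50% overlap between neighbors.
segs = sliding_segments(num_samples=16000 * 4, win=24000, hop=12000)
```

Each segment would then be fed to the speaker model, and the resulting embeddings clustered into speakers.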

Speaker Diarization

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

1 code implementation • 30 Mar 2022 • Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

However, domain mismatch between training and test conditions, due to factors such as speaker, content, channel, and environment, remains a severe problem for speech separation.

Speech Separation

CasNet: Investigating Channel Robustness for Speech Separation

1 code implementation • 27 Oct 2022 • Fan-Lin Wang, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

In this study, building on our previously constructed TAT-2mix corpus, we address the channel mismatch problem by proposing a channel-aware audio separation network (CasNet), a deep learning framework for end-to-end time-domain speech separation.
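Two-speaker separation corpora of this kind are generally built by overlapping pairs of single-speaker utterances at a chosen level difference. The sketch below shows a generic 2-mix recipe; it is an illustration of the idea, not the actual TAT-2mix construction pipeline.

```python
import numpy as np

def mix_two_speakers(s1, s2, snr_db=0.0):
    """Mix two single-speaker waveforms so that s1 is `snr_db` dB louder
    than s2. Returns the mixture and the two (rescaled) references used
    as separation targets. Generic recipe for illustration only."""
    s1 = np.asarray(s1, dtype=float)
    s2 = np.asarray(s2, dtype=float)
    n = min(len(s1), len(s2))           # truncate to the shorter utterance
    s1, s2 = s1[:n], s2[:n]
    p1, p2 = np.mean(s1 ** 2), np.mean(s2 ** 2)
    scale = np.sqrt(p1 / (p2 * 10 ** (snr_db / 10)))
    return s1 + scale * s2, s1, scale * s2

rng = np.random.default_rng(0)
s1 = rng.standard_normal(16000)         # stand-ins for real utterances
s2 = rng.standard_normal(16000)
mixture, ref1, ref2 = mix_two_speakers(s1, s2, snr_db=5.0)
```

Channel robustness studies then vary the recording channel of the source utterances while keeping the mixing recipe fixed.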

Speech Separation

A Training and Inference Strategy Using Noisy and Enhanced Speech as Target for Speech Enhancement without Clean Speech

2 code implementations • 27 Oct 2022 • Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

The lack of clean speech is a practical challenge to the development of speech enhancement systems, which means that there is an inevitable mismatch between their training criterion and evaluation metric.
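One way to train without clean references, in the spirit of the noisy-target half of the title, is to add extra noise to an already-noisy utterance and use the original noisy utterance as the target. The sketch below shows only that pair construction under this assumption; the paper's full strategy also involves enhanced speech as a target, which is not reproduced here.

```python
import numpy as np

def make_training_pair(noisy, extra_noise):
    """Build a training pair without clean speech: the model input is the
    noisy utterance with additional noise mixed in, and the training target
    is the original noisy utterance itself (noisy-target-style training;
    illustrative, not the paper's exact recipe)."""
    noisy = np.asarray(noisy, dtype=float)
    noise = np.asarray(extra_noise, dtype=float)[:len(noisy)]
    return noisy + noise, noisy  # (model input, training target)

model_input, target = make_training_pair([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
```

A network trained on many such pairs learns to strip the added noise, which transfers to removing the original noise as well.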

Speech Enhancement

The North System for Formosa Speech Recognition Challenge 2023

no code implementations • 5 Oct 2023 • Li-Wei Chen, Kai-Chen Cheng, Hung-Shin Lee

This report provides a concise overview of the proposed North system, which aims to achieve automatic word/syllable recognition for Taiwanese Hakka (Sixian).

Speech Recognition
