Search Results for author: Hung-Shin Lee

Found 15 papers, 4 papers with code

CasNet: Investigating Channel Robustness for Speech Separation

no code implementations27 Oct 2022 Fan-Lin Wang, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

In this study, inheriting the use of our previously constructed TAT-2mix corpus, we address the channel mismatch problem by proposing a channel-aware audio separation network (CasNet), a deep learning framework for end-to-end time-domain speech separation.

Speech Separation

A Teacher-student Framework for Unsupervised Speech Enhancement Using Noise Remixing Training and Two-stage Inference

1 code implementation27 Oct 2022 Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

The lack of clean speech is a practical challenge to the development of speech enhancement systems, which means that the training of neural network models must be done in an unsupervised manner, and there is an inevitable mismatch between their training criterion and evaluation metric.

Speech Enhancement

Generation of Speaker Representations Using Heterogeneous Training Batch Assembly

no code implementations30 Mar 2022 Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang

In traditional speaker diarization systems, a well-trained speaker model is a key component to extract representations from consecutive and partially overlapping segments in a long speech session.

speaker-diarization Speaker Diarization

Disentangling the Impacts of Language and Channel Variability on Speech Separation Networks

1 code implementation30 Mar 2022 Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang

However, domain mismatch between training/test situations due to factors, such as speaker, content, channel, and environment, remains a severe problem for speech separation.

Speech Separation

Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition

no code implementations28 Mar 2022 Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang

Phonotactic constraints can be employed to distinguish languages by representing a speech utterance as a multinomial distribution or phone events.

Speech-enhanced and Noise-aware Networks for Robust Speech Recognition

1 code implementation25 Mar 2022 Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang

In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition.

Automatic Speech Recognition Robust Speech Recognition +2

Chain-based Discriminative Autoencoders for Speech Recognition

no code implementations25 Mar 2022 Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang

For application to robust speech recognition, we further extend c-DcAE to hierarchical and parallel structures, resulting in hc-DcAE and pc-DcAE.

Robust Speech Recognition speech-recognition

AlloST: Low-resource Speech Translation without Source Transcription

1 code implementation1 May 2021 Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang

In this study, we survey methods to improve ST performance without using source transcription, and propose a learning framework that utilizes a language-independent universal phone recognizer.


The Academia Sinica Systems of Voice Conversion for VCC2020

no code implementations6 Oct 2020 Yu-Huai Peng, Cheng-Hung Hu, Alexander Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang

This paper describes the Academia Sinica systems for the two tasks of Voice Conversion Challenge 2020, namely voice conversion within the same language (Task 1) and cross-lingual voice conversion (Task 2).

Voice Conversion

Cannot find the paper you are looking for? You can Submit a new open access paper.