no code implementations • 5 Oct 2023 • Li-Wei Chen, Kai-Chen Cheng, Hung-Shin Lee
This report provides a concise overview of the proposed North system, which aims to achieve automatic word/syllable recognition for Taiwanese Hakka (Sixian).
2 code implementations • 27 Oct 2022 • Li-Wei Chen, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
The lack of clean speech is a practical challenge in developing speech enhancement systems: it creates an inevitable mismatch between their training criterion and evaluation metric.
1 code implementation • 27 Oct 2022 • Fan-Lin Wang, Yao-Fei Cheng, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
In this study, inheriting the use of our previously constructed TAT-2mix corpus, we address the channel mismatch problem by proposing a channel-aware audio separation network (CasNet), a deep learning framework for end-to-end time-domain speech separation.
no code implementations • 1 Apr 2022 • Chiang-Lin Tai, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
Children's speech recognition is indispensable but challenging due to the diversity of children's speech.
1 code implementation • 30 Mar 2022 • Fan-Lin Wang, Hung-Shin Lee, Yu Tsao, Hsin-Min Wang
However, domain mismatch between training and test conditions due to factors such as speaker, content, channel, and environment remains a severe problem for speech separation.
no code implementations • 30 Mar 2022 • Yu-Huai Peng, Hung-Shin Lee, Pin-Tuan Huang, Hsin-Min Wang
In traditional speaker diarization systems, a well-trained speaker model is a key component to extract representations from consecutive and partially overlapping segments in a long speech session.
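As a toy illustration of that pipeline, segment-level embeddings can be grouped by speaker with a simple clustering step (the 2-means helper and the synthetic embeddings below are illustrative assumptions; real diarization systems extract embeddings with a trained speaker model and use stronger clustering):

```python
import numpy as np

# Toy sketch of the diarization pipeline described above: one embedding per
# short segment, then cluster segments by speaker (simple 2-means here).
def two_means(embeddings, iters=10):
    c = embeddings[:2].copy()  # initialize centroids from the first two segments
    for _ in range(iters):
        # assign each segment to its nearest centroid
        d = np.linalg.norm(embeddings[:, None] - c[None], axis=2)
        labels = d.argmin(axis=1)
        # move each centroid to the mean of its assigned segments
        for k in range(2):
            if (labels == k).any():
                c[k] = embeddings[labels == k].mean(axis=0)
    return labels

# Synthetic "embeddings" for two well-separated speakers (assumption).
rng = np.random.default_rng(1)
spk_a = rng.normal(0.0, 0.05, size=(5, 2))
spk_b = rng.normal(1.0, 0.05, size=(5, 2))
segments = np.vstack([spk_a[:1], spk_b[:1], spk_a[1:], spk_b[1:]])
labels = two_means(segments)
print(labels)  # segments from the same speaker receive the same cluster id
```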
no code implementations • 28 Mar 2022 • Hung-Shin Lee, Yu Tsao, Shyh-Kang Jeng, Hsin-Min Wang
Phonotactic constraints can be employed to distinguish languages by representing a speech utterance as a multinomial distribution of phone events.
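The phonotactic representation this entry mentions can be illustrated with a toy sketch (the bigram unit and helper below are illustrative assumptions, not the paper's actual features):

```python
from collections import Counter

# Hypothetical phonotactic representation: a decoded phone sequence mapped to
# a multinomial (relative-frequency) distribution over phone bigrams, which a
# language classifier could then compare across languages.
def phone_bigram_distribution(phones):
    bigrams = list(zip(phones, phones[1:]))
    counts = Counter(bigrams)
    total = sum(counts.values())
    return {bg: c / total for bg, c in counts.items()}

dist = phone_bigram_distribution(["s", "t", "a", "t", "a"])
print(dist)  # ("t", "a") occurs twice out of four bigrams, so its mass is 0.5
```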
no code implementations • 25 Mar 2022 • Hung-Shin Lee, Pin-Tuan Huang, Yao-Fei Cheng, Hsin-Min Wang
For application to robust speech recognition, we further extend c-DcAE to hierarchical and parallel structures, resulting in hc-DcAE and pc-DcAE.
1 code implementation • 25 Mar 2022 • Hung-Shin Lee, Pin-Yuan Chen, Yao-Fei Cheng, Yu Tsao, Hsin-Min Wang
In this paper, a noise-aware training framework based on two cascaded neural structures is proposed to jointly optimize speech enhancement and speech recognition.
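The joint optimization idea can be sketched as a weighted sum of an enhancement loss and a recognition loss (the weight and the MSE/NLL stand-ins below are illustrative assumptions, not the paper's actual training criteria):

```python
import numpy as np

# Hypothetical joint objective for a cascaded enhance-then-recognize system:
# total = w * enhancement loss + (1 - w) * recognition loss.
def joint_loss(enhanced, clean, asr_log_probs, target_ids, w=0.5):
    # Enhancement term: mean squared error against the clean waveform.
    l_enh = np.mean((enhanced - clean) ** 2)
    # Recognition term: negative log-likelihood of the reference tokens
    # (a stand-in for the CTC/attention loss an ASR back-end would use).
    l_asr = -np.mean(asr_log_probs[np.arange(len(target_ids)), target_ids])
    return w * l_enh + (1 - w) * l_asr

# Synthetic example inputs (assumptions for illustration).
rng = np.random.default_rng(0)
clean = rng.standard_normal(160)
enhanced = clean + 0.1 * rng.standard_normal(160)   # lightly noisy estimate
log_probs = np.log(np.full((4, 10), 0.1))           # uniform over 10 tokens
targets = np.array([1, 4, 2, 7])
loss = joint_loss(enhanced, clean, log_probs, targets)
print(round(loss, 3))
```

Training both stages against such a combined objective lets gradients from the recognition term shape the enhancement front-end, which is the motivation for cascading the two structures.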
Automatic Speech Recognition (ASR) +3
no code implementations • 14 Jun 2021 • Fan-Lin Wang, Yu-Huai Peng, Hung-Shin Lee, Hsin-Min Wang
DPFN is composed of two parts: the speaker module and the separation module.
no code implementations • 10 Jun 2021 • Yi-Chiao Wu, Cheng-Hung Hu, Hung-Shin Lee, Yu-Huai Peng, Wen-Chin Huang, Yu Tsao, Hsin-Min Wang, Tomoki Toda
Nowadays, neural vocoders can generate very high-fidelity speech when a large amount of training data is available.
1 code implementation • 1 May 2021 • Yao-Fei Cheng, Hung-Shin Lee, Hsin-Min Wang
In this study, we survey methods to improve ST performance without using source transcription, and propose a learning framework that utilizes a language-independent universal phone recognizer.
no code implementations • 6 Oct 2020 • Yu-Huai Peng, Cheng-Hung Hu, Alexander Kang, Hung-Shin Lee, Pin-Yuan Chen, Yu Tsao, Hsin-Min Wang
This paper describes the Academia Sinica systems for the two tasks of Voice Conversion Challenge 2020, namely voice conversion within the same language (Task 1) and cross-lingual voice conversion (Task 2).