Search Results for author: Kuo-Hsuan Hung

Found 11 papers, 3 papers with code

Audio-Visual Speech Enhancement and Separation by Leveraging Multi-Modal Self-Supervised Embeddings

no code implementations • 31 Oct 2022 • I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HuBERT can be generalized to audio-visual regression tasks.

Automatic Speech Recognition • Lip Reading • +5

Boosting Self-Supervised Embeddings for Speech Enhancement

1 code implementation • 7 Apr 2022 • Kuo-Hsuan Hung, Szu-Wei Fu, Huan-Hsin Tseng, Hsin-Tien Chiang, Yu Tsao, Chii-Wann Lin

We further study the relationship between the noise robustness of SSL representations, measured by a clean-noisy distance (CN distance), and the importance of each layer for SE; a generic sketch of this measurement follows below.

Self-Supervised Learning • Speech Enhancement
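
The CN distance mentioned above can be read as a per-layer comparison of SSL features extracted from clean and noisy versions of the same utterance. Below is a minimal sketch of one such measurement, assuming a generic frozen encoder that exposes per-layer hidden states; the encoder interface, distance, and pooling are illustrative assumptions, not the paper's exact recipe:

```python
import torch

def clean_noisy_distance(encoder, clean_wav, noisy_wav):
    """Per-layer distance between SSL features of clean and noisy speech.

    `encoder` is assumed to be a frozen self-supervised model (HuBERT/WavLM
    style) whose forward pass returns an object with `.hidden_states`, a
    tuple of (batch, time, dim) tensors, when output_hidden_states=True.
    """
    with torch.no_grad():
        clean_layers = encoder(clean_wav, output_hidden_states=True).hidden_states
        noisy_layers = encoder(noisy_wav, output_hidden_states=True).hidden_states

    distances = []
    for c, n in zip(clean_layers, noisy_layers):
        # Frame-wise Euclidean distance, averaged over time and batch.
        d = torch.linalg.norm(c - n, dim=-1).mean()
        distances.append(d.item())
    return distances  # one CN-distance value per encoder layer
```

Under this reading, layers with a smaller clean-noisy distance are the more noise-robust candidates to weight heavily when feeding an SE model.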

MetricGAN-U: Unsupervised speech enhancement/dereverberation based only on noisy/reverberated speech

1 code implementation • 12 Oct 2021 • Szu-Wei Fu, Cheng Yu, Kuo-Hsuan Hung, Mirco Ravanelli, Yu Tsao

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training.

Speech Enhancement
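
The snippet above describes the supervised setting that MetricGAN-U avoids. Below is a minimal, hedged sketch of a MetricGAN-style training step in which the discriminator regresses a non-intrusive quality score of the enhanced output, so no clean reference is required; the `quality_metric`, network definitions, and optimizers are placeholders, not the authors' released code:

```python
import torch
import torch.nn.functional as F

def metricgan_u_step(generator, discriminator, noisy_batch,
                     quality_metric, g_opt, d_opt):
    """One hedged MetricGAN-U-style update.

    `quality_metric` is assumed to be a non-intrusive estimator (e.g. a
    DNSMOS-style model) that scores a waveform in [0, 1] without any clean
    reference; all components here are illustrative placeholders.
    """
    # --- Discriminator: learn to predict the metric score of enhanced speech ---
    with torch.no_grad():
        enhanced = generator(noisy_batch)
        target_score = quality_metric(enhanced)   # no clean speech involved
    d_loss = F.mse_loss(discriminator(enhanced), target_score)
    d_opt.zero_grad(); d_loss.backward(); d_opt.step()

    # --- Generator: push the predicted score toward the maximum (1.0) ---
    enhanced = generator(noisy_batch)
    pred_score = discriminator(enhanced)
    g_loss = F.mse_loss(pred_score, torch.ones_like(pred_score))
    g_opt.zero_grad(); g_loss.backward(); g_opt.step()
    return d_loss.item(), g_loss.item()
```

The key point of this sketch is that both losses are computed from noisy inputs and their enhanced versions only, which is what makes training possible without paired clean speech.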

Speech Recovery for Real-World Self-powered Intermittent Devices

no code implementations • 9 Jun 2021 • Yu-Chen Lin, Tsun-An Hsieh, Kuo-Hsuan Hung, Cheng Yu, Harinath Garudadri, Yu Tsao, Tei-Wei Kuo

The incompleteness of speech inputs severely degrades the performance of all the related speech signal processing applications.

EMA2S: An End-to-End Multimodal Articulatory-to-Speech System

no code implementations • 7 Feb 2021 • Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Wen-Chin Huang, Xugang Lu, Yu Tsao

Synthesized speech from articulatory movements can have real-world use for patients with vocal cord disorders, situations requiring silent speech, or in high-noise environments.

A Study of Incorporating Articulatory Movement Information in Speech Enhancement

no code implementations • 3 Nov 2020 • Yu-Wen Chen, Kuo-Hsuan Hung, Shang-Yi Chuang, Jonathan Sherman, Xugang Lu, Yu Tsao

Although deep learning algorithms are widely used for improving speech enhancement (SE) performance, the performance remains limited under highly challenging conditions, such as unseen noise or noise signals having low signal-to-noise ratios (SNRs).

Speech Enhancement
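
Since the snippet above frames the hard conditions in terms of low signal-to-noise ratios, a short reminder of how a noisy mixture at a target SNR is typically synthesized may be useful; this is a generic mixing routine based on the standard SNR definition, not code from the paper:

```python
import numpy as np

def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so that speech + noise has the requested SNR in dB.

    SNR_dB = 10 * log10(P_speech / P_noise), with power taken as the
    mean squared amplitude of each signal.
    """
    noise = noise[: len(speech)]              # crop to the same length
    p_speech = np.mean(speech ** 2)
    p_noise = np.mean(noise ** 2) + 1e-12     # avoid division by zero
    scale = np.sqrt(p_speech / (p_noise * 10 ** (snr_db / 10)))
    return speech + scale * noise

# e.g. a -5 dB condition, one of the low-SNR settings such papers evaluate:
# noisy = mix_at_snr(clean_waveform, noise_waveform, snr_db=-5)
```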

CITISEN: A Deep Learning-Based Speech Signal-Processing Mobile Application

1 code implementation • 21 Aug 2020 • Yu-Wen Chen, Kuo-Hsuan Hung, You-Jin Li, Alexander Chao-Fu Kang, Ya-Hsin Lai, Kai-Chun Liu, Szu-Wei Fu, Syu-Siang Wang, Yu Tsao

CITISEN provides three functions: speech enhancement (SE), model adaptation (MA), and background noise conversion (BNC). Together these allow CITISEN to serve as a platform for applying and evaluating SE models and for flexibly extending them to various noise environments and users.

Acoustic Scene Classification • Data Augmentation • +2

Boosting Objective Scores of a Speech Enhancement Model by MetricGAN Post-processing

no code implementations • 18 Jun 2020 • Szu-Wei Fu, Chien-Feng Liao, Tsun-An Hsieh, Kuo-Hsuan Hung, Syu-Siang Wang, Cheng Yu, Heng-Cheng Kuo, Ryandhimas E. Zezario, You-Jin Li, Shang-Yi Chuang, Yen-Ju Lu, Yu Tsao

The Transformer architecture has demonstrated a superior ability compared to recurrent neural networks in many different natural language processing applications.

Speech Enhancement

Time-Domain Multi-modal Bone/air Conducted Speech Enhancement

no code implementations • 22 Nov 2019 • Cheng Yu, Kuo-Hsuan Hung, Syu-Siang Wang, Szu-Wei Fu, Yu Tsao, Jeih-weih Hung

Previous studies have proven that integrating video signals, as a complementary modality, can facilitate improved performance for speech enhancement (SE).

Ensemble Learning • Speech Enhancement
