Search Results for author: Cheng Yu

Found 26 papers, 8 papers with code

Cross-Utterance Conditioned VAE for Non-Autoregressive Text-to-Speech

1 code implementation ACL 2022 Yang Li, Cheng Yu, Guangzhi Sun, Hua Jiang, Fanglei Sun, Weiqin Zu, Ying Wen, Yang Yang, Jun Wang

Modelling prosody variation is critical for synthesizing natural and expressive speech in end-to-end text-to-speech (TTS) systems.

Perceptual Contrast Stretching on Target Feature for Speech Enhancement

1 code implementation31 Mar 2022 Rong Chao, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao

Specifically, the contrast of target features is stretched based on perceptual importance, thereby improving the overall SE performance.

Speech Enhancement

Conditional Diffusion Probabilistic Model for Speech Enhancement

2 code implementations10 Feb 2022 Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao

Speech enhancement is a critical component of many user-oriented audio applications, yet current systems still suffer from distorted and unnatural outputs.

Speech Enhancement Speech Synthesis

HASA-net: A non-intrusive hearing-aid speech assessment network

no code implementations10 Nov 2021 Hsin-Tien Chiang, Yi-Chiao Wu, Cheng Yu, Tomoki Toda, Hsin-Min Wang, Yih-Chun Hu, Yu Tsao

Without the need of a clean reference, non-intrusive speech assessment methods have caught great attention for objective evaluations.

OSSEM: one-shot speaker adaptive speech enhancement using meta learning

no code implementations10 Nov 2021 Cheng Yu, Szu-Wei Fu, Tsun-An Hsieh, Yu Tsao, Mirco Ravanelli

Although deep learning (DL) has achieved notable progress in speech enhancement (SE), further research is still required for a DL-based SE system to adapt effectively and efficiently to particular speakers.

Meta-Learning Speech Enhancement

SEOFP-NET: Compression and Acceleration of Deep Neural Networks for Speech Enhancement Using Sign-Exponent-Only Floating-Points

no code implementations8 Nov 2021 Yu-Chen Lin, Cheng Yu, Yi-Te Hsu, Szu-Wei Fu, Yu Tsao, Tei-Wei Kuo

In this paper, a novel sign-exponent-only floating-point network (SEOFP-NET) technique is proposed to compress the model size and accelerate the inference time for speech enhancement, a regression task of speech signal processing.

Model Compression regression +1

MetricGAN-U: Unsupervised speech enhancement/ dereverberation based only on noisy/ reverberated speech

1 code implementation12 Oct 2021 Szu-Wei Fu, Cheng Yu, Kuo-Hsuan Hung, Mirco Ravanelli, Yu Tsao

Most of the deep learning-based speech enhancement models are learned in a supervised manner, which implies that pairs of noisy and clean speech are required during training.

Speech Enhancement

Mutual Information Continuity-constrained Estimator

no code implementations29 Sep 2021 Tsun-An Hsieh, Cheng Yu, Ying Hung, Chung-Ching Lin, Yu Tsao

Accordingly, we propose Mutual Information Continuity-constrained Estimator (MICE).

Density Estimation

Diverse Similarity Encoder for Deep GAN Inversion

2 code implementations23 Aug 2021 Cheng Yu, Wenmin Wang

Current deep generative adversarial networks (GANs) can synthesize high-quality (HQ) images, so learning representation with GANs is favorable.

Image Reconstruction

Speech Recovery for Real-World Self-powered Intermittent Devices

no code implementations9 Jun 2021 Yu-Chen Lin, Tsun-An Hsieh, Kuo-Hsuan Hung, Cheng Yu, Harinath Garudadri, Yu Tsao, Tei-Wei Kuo

The incompleteness of speech inputs severely degrades the performance of all the related speech signal processing applications.

MetricGAN+: An Improved Version of MetricGAN for Speech Enhancement

3 code implementations8 Apr 2021 Szu-Wei Fu, Cheng Yu, Tsun-An Hsieh, Peter Plantinga, Mirco Ravanelli, Xugang Lu, Yu Tsao

The discrepancy between the cost function used for training a speech enhancement model and human auditory perception usually makes the quality of enhanced speech unsatisfactory.

Speech Enhancement

Global ill-posedness for a dense set of initial data to the Isentropic system of gas dynamics

no code implementations8 Mar 2021 Robin Ming Chen, Alexis F. Vasseur, Cheng Yu

In dimension $n=2$ and $3$, we show that for any initial datum belonging to a dense subset of the energy space, there exist infinitely many global-in-time admissible weak solutions to the isentropic Euler system whenever $1<\gamma\leq 1+\frac2n$.

Analysis of PDEs 35Q31, 76N10, 35L65

Inviscid limit of the inhomogeneous incompressible Navier-Stokes equations under the weak Kolmogorov hypothesis in $\mathbb{R}^3$

no code implementations4 Feb 2021 Dixi Wang, Cheng Yu, Xinhua Zhao

In this paper, we consider the inviscid limit of inhomogeneous incompressible Navier-Stokes equations under the weak Kolmogorov hypothesis in $\mathbb{R}^3$.

Analysis of PDEs

Dissipative solutions to the compressible isentropic Navier-Stokes equations

no code implementations4 Feb 2021 Liang Guo, Ducati Li, Cheng Yu

The existence of dissipative solutions to the compressible isentropic Navier-Stokes equations was established in this paper.

Analysis of PDEs

Attention-based multi-task learning for speech-enhancement and speaker-identification in multi-speaker dialogue scenario

1 code implementation7 Jan 2021 Chiang-Jen Peng, Yun-Ju Chan, Cheng Yu, Syu-Siang Wang, Yu Tsao, Tai-Shih Chi

In this study, we propose an attention-based MTL (ATM) approach that integrates MTL and the attention-weighting mechanism to simultaneously realize a multi-model learning structure that performs speech enhancement (SE) and speaker identification (SI).

Multi-Task Learning Speaker Identification +1

Defending Against Universal Adversarial Patches by Clipping Feature Norms

no code implementations ICCV 2021 Cheng Yu, Jiansheng Chen, Youze Xue, Yuyang Liu, Weitao Wan, Jiayu Bao, Huimin Ma

Physical-world adversarial attacks based on universal adversarial patches have been proved to be able to mislead deep convolutional neural networks (CNNs), exposing the vulnerability of real-world visual classification systems based on CNNs.

Speech Enhancement Guided by Contextual Articulatory Information

no code implementations15 Nov 2020 Yen-Ju Lu, Chia-Yu Chang, Cheng Yu, Ching-Feng Liu, Jeih-weih Hung, Shinji Watanabe, Yu Tsao

Previous studies have confirmed that by augmenting acoustic features with the place/manner of articulatory features, the speech enhancement (SE) process can be guided to consider the articulatory properties of the input speech when performing enhancement to attain performance improvements.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

Improving Perceptual Quality by Phone-Fortified Perceptual Loss using Wasserstein Distance for Speech Enhancement

1 code implementation28 Oct 2020 Tsun-An Hsieh, Cheng Yu, Szu-Wei Fu, Xugang Lu, Yu Tsao

Speech enhancement (SE) aims to improve speech quality and intelligibility, which are both related to a smooth transition in speech segments that may carry linguistic information, e. g. phones and syllables.

Speech Enhancement

Multi-Scale Networks for 3D Human Pose Estimation with Inference Stage Optimization

no code implementations13 Oct 2020 Cheng Yu, Bo wang, Bo Yang, Robby T. Tan

Addressing these problems, we introduce a spatio-temporal network for robust 3D human pose estimation.

2D Pose Estimation Pose Prediction

Boosting Objective Scores of a Speech Enhancement Model by MetricGAN Post-processing

no code implementations18 Jun 2020 Szu-Wei Fu, Chien-Feng Liao, Tsun-An Hsieh, Kuo-Hsuan Hung, Syu-Siang Wang, Cheng Yu, Heng-Cheng Kuo, Ryandhimas E. Zezario, You-Jin Li, Shang-Yi Chuang, Yen-Ju Lu, Yu Tsao

The Transformer architecture has demonstrated a superior ability compared to recurrent neural networks in many different natural language processing applications.

Speech Enhancement

Speech Enhancement based on Denoising Autoencoder with Multi-branched Encoders

no code implementations6 Jan 2020 Cheng Yu, Ryandhimas E. Zezario, Jonathan Sherman, Yi-Yen Hsieh, Xugang Lu, Hsin-Min Wang, Yu Tsao

The DSDT is built based on a prior knowledge of speech and noisy conditions (the speaker, environment, and signal factors are considered in this paper), where each component of the multi-branched encoder performs a particular mapping from noisy to clean speech along the branch in the DSDT.

Denoising Speech Enhancement

Time-Domain Multi-modal Bone/air Conducted Speech Enhancement

no code implementations22 Nov 2019 Cheng Yu, Kuo-Hsuan Hung, Syu-Siang Wang, Szu-Wei Fu, Yu Tsao, Jeih-weih Hung

Previous studies have proven that integrating video signals, as a complementary modality, can facilitate improved performance for speech enhancement (SE).

Ensemble Learning Speech Enhancement

Increasing Compactness Of Deep Learning Based Speech Enhancement Models With Parameter Pruning And Quantization Techniques

no code implementations31 May 2019 Jyun-Yi Wu, Cheng Yu, Szu-Wei Fu, Chih-Ting Liu, Shao-Yi Chien, Yu Tsao

In addition, a parameter quantization (PQ) technique was applied to reduce the size of a neural network by representing weights with fewer cluster centroids.

Denoising Quantization +1

Cannot find the paper you are looking for? You can Submit a new open access paper.