1 code implementation • 16 Oct 2020 • Shengkui Zhao, Trung Hieu Nguyen, Hao Wang, Bin Ma
With these data, three neural TTS models -- Tacotron2, Transformer, and FastSpeech -- are applied for building bilingual and code-switched TTS.
1 code implementation • 3 Feb 2021 • Shengkui Zhao, Trung Hieu Nguyen, Bin Ma
In this paper, we propose a complex convolutional block attention module (CCBAM) to boost the representation power of the complex-valued convolutional layers by constructing more informative features.
Ranked #1 on Speech Enhancement on DNS Challenge
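The core idea behind a complex attention block like CCBAM can be illustrated with channel attention over complex-valued features. The sketch below is a minimal numpy illustration, not the paper's implementation: attention weights are derived from the per-channel magnitude, passed through a toy bottleneck with fixed random weights (a real module would learn these), and the resulting gate rescales both the real and imaginary parts. All names are hypothetical.

```python
import numpy as np

def complex_channel_attention(real, imag, reduction=2, seed=0):
    """Illustrative channel attention on complex features (not the CCBAM API).

    real, imag: arrays of shape (channels, time, freq) holding the two
    parts of a complex feature map.
    """
    c = real.shape[0]
    mag = np.sqrt(real**2 + imag**2)            # per-channel magnitude
    pooled = mag.mean(axis=(1, 2))              # global average pool -> (c,)
    # toy bottleneck MLP; weights are random placeholders for illustration
    rng = np.random.default_rng(seed)
    w1 = rng.standard_normal((c, c // reduction)) * 0.1
    w2 = rng.standard_normal((c // reduction, c)) * 0.1
    hidden = np.maximum(pooled @ w1, 0.0)       # ReLU
    gate = 1.0 / (1.0 + np.exp(-(hidden @ w2))) # sigmoid gate in (0, 1)
    gate = gate[:, None, None]
    # the same gate scales real and imaginary parts, so phase is preserved
    return real * gate, imag * gate
```

Because one shared gate multiplies both parts, the phase of each time-frequency bin is unchanged; only the channel's magnitude is re-weighted.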
no code implementations • 3 Feb 2021 • Shengkui Zhao, Hao Wang, Trung Hieu Nguyen, Bin Ma
Cross-lingual voice conversion (VC) is an important and challenging problem due to significant mismatches of the phonetic set and the speech prosody of different languages.
no code implementations • 2 Oct 2021 • Karn N. Watcharasupat, Thi Ngoc Tho Nguyen, Woon-Seng Gan, Shengkui Zhao, Bin Ma
We also propose a dual-mask technique for joint echo and noise suppression with simultaneous speech enhancement.
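One plausible reading of a dual-mask scheme is two bounded masks estimated over the mixture spectrogram, one targeting echo and one targeting noise, applied jointly. The snippet below is a hedged sketch of that formulation (element-wise product of two [0, 1] masks); the paper's actual mask definition and combination rule may differ, and the names are illustrative.

```python
import numpy as np

def dual_mask_enhance(mix_mag, echo_mask, noise_mask):
    """Jointly apply an echo-suppression mask and a noise-suppression mask
    to a magnitude spectrogram. Illustrative only: masks are clipped to
    [0, 1] and combined multiplicatively."""
    return mix_mag * np.clip(echo_mask, 0.0, 1.0) * np.clip(noise_mask, 0.0, 1.0)
```

With both masks bounded in [0, 1], the enhanced magnitude can never exceed the mixture, which keeps the suppression stage stable.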
1 code implementation • 23 Feb 2023 • Shengkui Zhao, Bin Ma
To address the indirect modelling of elemental interactions across chunks in the dual-path architecture, MossFormer employs a joint local and global self-attention architecture that simultaneously performs full-computation self-attention on local chunks and a linearised, low-cost self-attention over the full sequence.
Ranked #2 on Speech Separation on WHAMR!
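The joint local/global idea can be sketched in a few lines: exact softmax attention inside each chunk, plus a kernel-based linearised attention over the whole sequence whose cost grows linearly in sequence length. This is a minimal numpy illustration of the general technique, not MossFormer's actual block (which adds gating and convolutions); the elu(x)+1 feature map is a common choice for linearised attention and is assumed here.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def joint_local_global_attention(q, k, v, chunk=4):
    """Sketch: full softmax attention within non-overlapping chunks,
    plus linearised attention over the full sequence. q, k, v: (T, d)."""
    t, d = q.shape
    # local branch: exact attention restricted to each chunk, O(T * chunk * d)
    local = np.zeros_like(v)
    for s in range(0, t, chunk):
        qs, ks, vs = q[s:s+chunk], k[s:s+chunk], v[s:s+chunk]
        local[s:s+chunk] = softmax(qs @ ks.T / np.sqrt(d)) @ vs
    # global branch: linearised attention, O(T * d^2) instead of O(T^2 * d)
    phi = lambda x: np.where(x > 0, x + 1.0, np.exp(x))  # elu(x)+1, positive map
    qg, kg = phi(q), phi(k)
    kv = kg.T @ v                        # (d, d) summary of the whole sequence
    z = qg @ kg.sum(axis=0)              # per-query normaliser
    global_out = (qg @ kv) / z[:, None]
    return local + global_out
```

The global branch never materialises a T-by-T attention matrix, which is what makes full-sequence attention affordable for long inputs.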
1 code implementation • 20 May 2023 • Jia Qi Yip, Tuan Truong, Dianwen Ng, Chong Zhang, Yukun Ma, Trung Hieu Nguyen, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma
In this paper, we propose ACA-Net, a lightweight, global context-aware speaker embedding extractor for Speaker Verification (SV) that improves upon existing work by using Asymmetric Cross Attention (ACA) to replace temporal pooling.
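Replacing temporal pooling with cross attention can be sketched as a small fixed set of learned queries attending over a variable number of frames, producing a fixed-size output regardless of utterance length. The numpy sketch below shows that general mechanism; all names are hypothetical and this is not the ACA-Net architecture itself.

```python
import numpy as np

def query_pooling(features, queries):
    """Cross-attention pooling sketch: a few learned queries (n_q, d)
    attend over T frames (T, d), giving an (n_q, d) output whose size
    does not depend on T."""
    d = features.shape[1]
    scores = queries @ features.T / np.sqrt(d)   # (n_q, T)
    e = np.exp(scores - scores.max(axis=1, keepdims=True))
    attn = e / e.sum(axis=1, keepdims=True)      # softmax over time
    return attn @ features                       # (n_q, d)
```

Unlike mean or statistics pooling, each query can learn to focus on different parts of the utterance, while still yielding a fixed-size embedding.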
1 code implementation • 22 Sep 2023 • Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma
Dual-path is a popular architecture for speech separation models (e.g., Sepformer): long sequences are split into overlapping chunks, with intra-blocks modelling local features within each chunk and inter-blocks modelling global relationships across chunks.
Ranked #5 on Speech Separation on WSJ0-2mix
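The chunking step that dual-path models share can be sketched as splitting a (T, d) sequence into overlapping chunks and later reassembling it by overlap-add. The numpy sketch below illustrates that plumbing under simple assumptions (zero-padded tail, overlaps resolved by averaging); function names are illustrative.

```python
import numpy as np

def split_into_chunks(x, chunk_len, hop):
    """Split a (T, d) sequence into overlapping chunks -> (n, chunk_len, d).
    The tail is zero-padded; hop = chunk_len // 2 gives 50% overlap."""
    t, d = x.shape
    n = max(1, int(np.ceil((t - chunk_len) / hop)) + 1)
    pad = (n - 1) * hop + chunk_len - t
    xp = np.pad(x, ((0, pad), (0, 0)))
    return np.stack([xp[i * hop : i * hop + chunk_len] for i in range(n)])

def merge_chunks(chunks, hop, length):
    """Overlap-add (n, chunk_len, d) chunks back into a (length, d) sequence,
    averaging frames covered by more than one chunk."""
    n, c, d = chunks.shape
    total = (n - 1) * hop + c
    out = np.zeros((total, d))
    cnt = np.zeros((total, 1))
    for i in range(n):
        out[i * hop : i * hop + c] += chunks[i]
        cnt[i * hop : i * hop + c] += 1
    return (out / cnt)[:length]
```

Intra-blocks then operate on the chunk axis and inter-blocks on the across-chunk axis; with averaging on the overlaps, merge after split is an exact round trip.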