Search Results for author: Yujun Wang

Found 15 papers, 6 papers with code

Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction

no code implementations • 28 Jun 2023 • Aoqi Guo, Junnan Wu, Peng Gao, Wenbo Zhu, Qinwen Guo, Dazhi Gao, Yujun Wang

In this paper, we propose a target speech extraction network that utilizes spatial information to enhance the performance of neural beamformer.

Dimensionality Reduction, Speech Extraction

Exploring Representation Learning for Small-Footprint Keyword Spotting

no code implementations • 20 Mar 2023 • Fan Cui, Liyong Guo, Quandong Wang, Peng Gao, Yujun Wang

To address those challenges, we explore representation learning for KWS by self-supervised contrastive learning and self-training with pretrained model.

Contrastive Learning, Representation Learning, +1

Improving Weakly Supervised Sound Event Detection with Causal Intervention

no code implementations • 10 Mar 2023 • Yifei Xin, Dongchao Yang, Fan Cui, Yujun Wang, Yuexian Zou

Existing weakly supervised sound event detection (WSSED) work has not explored both types of co-occurrences simultaneously, i.e., some sound events often co-occur, and their occurrences are usually accompanied by specific background sounds, so they would be inevitably entangled, causing misclassification and biased localization results with only clip-level supervision.

Event Detection, Sound Event Detection

Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

no code implementations • 7 Dec 2022 • Fengyu Yang, Jian Luan, Yujun Wang

We introduce phonology embedding to capture the English differences between different phonology.

Learning Decoupling Features Through Orthogonality Regularization

no code implementations • 31 Mar 2022 • Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou

Bearing this in mind, a two-branch deep network (KWS branch and SV branch) with the same network structure is developed and a novel decoupling feature learning method is proposed to push up the performance of KWS and SV simultaneously where speaker-invariant keyword representations and keyword-invariant speaker representations are expected respectively.

Keyword Spotting, Speaker Verification

PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control

no code implementations • 9 Oct 2021 • Yunchao He, Jian Luan, Yujun Wang

Sequence expansion between encoder and decoder is a critical challenge in sequence-to-sequence TTS.

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

2 code implementations • 13 Jun 2021 • Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training.

speech-recognition, Speech Recognition

speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

2 code implementations • 3 Apr 2021 • Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, YuKai Huang, Ke Li, Daniel Povey, Yujun Wang

This paper introduces a new open-source speech corpus named "speechocean762" designed for pronunciation assessment use, consisting of 5000 English utterances from 250 non-native speakers, where half of the speakers are children.

Phone-level pronunciation scoring, speech-recognition

RawNet: Fast End-to-End Neural Vocoder

1 code implementation • 10 Apr 2019 • Yunchao He, Yujun Wang

Neural network-based vocoders have recently demonstrated the powerful ability to synthesize high-quality speech.

Speech Synthesis

Attention-based End-to-End Models for Small-Footprint Keyword Spotting

2 code implementations • 29 Mar 2018 • Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie

In this paper, we propose an attention-based end-to-end neural approach for small-footprint keyword spotting (KWS), which aims to simplify the pipelines of building a production-quality KWS system.

Small-Footprint Keyword Spotting

Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model

1 code implementation • 27 Mar 2018 • Ke Wang, Junbo Zhang, Yujun Wang, Lei Xie

Speaker adaptation aims to estimate a speaker specific acoustic model from a speaker independent one to minimize the mismatch between the training and testing conditions arisen from speaker variabilities.

Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition

1 code implementation • 27 Mar 2018 • Ke Wang, Junbo Zhang, Sining Sun, Yujun Wang, Fei Xiang, Lei Xie

First, we study the effectiveness of different dereverberation networks (the generator in GAN) and find that LSTM leads a significant improvement as compared with feed-forward DNN and CNN in our dataset.

Robust Speech Recognition, Speech Dereverberation, +1

Attention-Based End-to-End Speech Recognition on Voice Search

no code implementations • 22 Jul 2017 • Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie

Previous attempts have shown that applying attention-based encoder-decoder to Mandarin speech recognition was quite difficult due to the logographic orthography of Mandarin, the large vocabulary and the conditional dependency of the attention model.

L2 Regularization, Language Modelling, +2
