Search Results for author: Yujun Wang

Found 15 papers, 6 papers with code

Enhanced Neural Beamformer with Spatial Information for Target Speech Extraction

no code implementations • 28 Jun 2023 • Aoqi Guo, Junnan Wu, Peng Gao, Wenbo Zhu, Qinwen Guo, Dazhi Gao, Yujun Wang

In this paper, we propose a target speech extraction network that utilizes spatial information to enhance the performance of the neural beamformer.

Dimensionality Reduction Speech Extraction

Exploring Representation Learning for Small-Footprint Keyword Spotting

no code implementations • 20 Mar 2023 • Fan Cui, Liyong Guo, Quandong Wang, Peng Gao, Yujun Wang

To address those challenges, we explore representation learning for KWS by self-supervised contrastive learning and self-training with a pretrained model.

Contrastive Learning Representation Learning +1

Relate auditory speech to EEG by shallow-deep attention-based network

no code implementations • 20 Mar 2023 • Fan Cui, Liyong Guo, Lang He, Jiyao Liu, Ercheng Pei, Yujun Wang, Dongmei Jiang

Electroencephalography (EEG) plays a vital role in detecting how the brain responds to different stimuli.

Data Augmentation Deep Attention +1

Improving Weakly Supervised Sound Event Detection with Causal Intervention

no code implementations • 10 Mar 2023 • Yifei Xin, Dongchao Yang, Fan Cui, Yujun Wang, Yuexian Zou

Existing weakly supervised sound event detection (WSSED) work has not explored both types of co-occurrences simultaneously, i.e., some sound events often co-occur, and their occurrences are usually accompanied by specific background sounds, so they would be inevitably entangled, causing misclassification and biased localization results with only clip-level supervision.

Event Detection Sound Event Detection

Improve Bilingual TTS Using Dynamic Language and Phonology Embedding

no code implementations • 7 Dec 2022 • Fengyu Yang, Jian Luan, Yujun Wang

We introduce phonology embedding to capture the English differences between different phonology.

Learning Decoupling Features Through Orthogonality Regularization

no code implementations • 31 Mar 2022 • Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou

Bearing this in mind, a two-branch deep network (KWS branch and SV branch) with the same network structure is developed and a novel decoupling feature learning method is proposed to push up the performance of KWS and SV simultaneously where speaker-invariant keyword representations and keyword-invariant speaker representations are expected respectively.

Keyword Spotting Speaker Verification

PAMA-TTS: Progression-Aware Monotonic Attention for Stable Seq2Seq TTS With Accurate Phoneme Duration Control

no code implementations • 9 Oct 2021 • Yunchao He, Jian Luan, Yujun Wang

Sequence expansion between encoder and decoder is a critical challenge in sequence-to-sequence TTS.

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

2 code implementations • 13 Jun 2021 • Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high-quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training.

Ranked #1 on Speech Recognition on GigaSpeech

Sentence speech-recognition +1

598

speechocean762: An Open-Source Non-native English Speech Corpus For Pronunciation Assessment

2 code implementations • 3 Apr 2021 • Junbo Zhang, Zhiwen Zhang, Yongqing Wang, Zhiyong Yan, Qiong Song, YuKai Huang, Ke Li, Daniel Povey, Yujun Wang

This paper introduces a new open-source speech corpus named "speechocean762" designed for pronunciation assessment use, consisting of 5000 English utterances from 250 non-native speakers, where half of the speakers are children.

Ranked #7 on Phone-level pronunciation scoring on speechocean762

Phone-level pronunciation scoring Sentence +1

13,701

AutoKWS: Keyword Spotting with Differentiable Architecture Search

no code implementations • 8 Sep 2020 • Bo Zhang, Wenfeng Li, Qingyuan Li, Weiji Zhuang, Xiangxiang Chu, Yujun Wang

Smart audio devices are gated by an always-on lightweight keyword spotting program to reduce power consumption.

Keyword Spotting Neural Architecture Search

RawNet: Fast End-to-End Neural Vocoder

1 code implementation • 10 Apr 2019 • Yunchao He, Yujun Wang

Neural network-based vocoders have recently demonstrated the powerful ability to synthesize high-quality speech.

Speech Synthesis

Attention-based End-to-End Models for Small-Footprint Keyword Spotting

3 code implementations • 29 Mar 2018 • Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie

In this paper, we propose an attention-based end-to-end neural approach for small-footprint keyword spotting (KWS), which aims to simplify the pipelines of building a production-quality KWS system.

Small-Footprint Keyword Spotting

Empirical Evaluation of Speaker Adaptation on DNN based Acoustic Model

1 code implementation • 27 Mar 2018 • Ke Wang, Junbo Zhang, Yujun Wang, Lei Xie

Speaker adaptation aims to estimate a speaker-specific acoustic model from a speaker-independent one to minimize the mismatch between the training and testing conditions arising from speaker variabilities.

Investigating Generative Adversarial Networks based Speech Dereverberation for Robust Speech Recognition

1 code implementation • 27 Mar 2018 • Ke Wang, Junbo Zhang, Sining Sun, Yujun Wang, Fei Xiang, Lei Xie

First, we study the effectiveness of different dereverberation networks (the generator in GAN) and find that LSTM leads to a significant improvement as compared with feed-forward DNN and CNN in our dataset.

Robust Speech Recognition Speech Dereverberation +1

Attention-Based End-to-End Speech Recognition on Voice Search

no code implementations • 22 Jul 2017 • Changhao Shan, Junbo Zhang, Yujun Wang, Lei Xie

Previous attempts have shown that applying attention-based encoder-decoder to Mandarin speech recognition was quite difficult due to the logographic orthography of Mandarin, the large vocabulary and the conditional dependency of the attention model.

L2 Regularization Language Modelling +3
