Why does Self-Supervised Learning for Speech Recognition Benefit Speaker Recognition?

no code implementations • 27 Apr 2022 • Sanyuan Chen, Yu Wu, Chengyi Wang, Shujie Liu, Zhuo Chen, Peidong Wang, Gang Liu, Jinyu Li, Jian Wu, Xiangzhan Yu, Furu Wei

Recently, self-supervised learning (SSL) has demonstrated strong performance in speaker recognition, even if the pre-training objective is designed for speech recognition.

Self-Supervised Learning, Speaker Recognition

A Conformer Based Acoustic Model for Robust Automatic Speech Recognition

no code implementations • 1 Mar 2022 • Yufeng Yang, Peidong Wang, DeLiang Wang

The proposed model builds on a state-of-the-art recognition system using a bi-directional long short-term memory (BLSTM) model with utterance-wise dropout and iterative speaker adaptation, but employs a Conformer encoder instead of the BLSTM network.

Automatic Speech Recognition

Predicting Atlantic Multidecadal Variability

no code implementations • 29 Oct 2021 • Glenn Liu, Peidong Wang, Matthew Beveridge, Young-Oh Kwon, Iddo Drori

Atlantic Multidecadal Variability (AMV) describes variations of North Atlantic sea surface temperature with a typical cycle of between 60 and 70 years.

Continuous Speech Separation with Recurrent Selective Attention Network

no code implementations • 28 Oct 2021 • Yixuan Zhang, Zhuo Chen, Jian Wu, Takuya Yoshioka, Peidong Wang, Zhong Meng, Jinyu Li

In this paper, we propose applying a recurrent selective attention network (RSAN) to continuous speech separation (CSS); the network generates a variable number of output channels based on active speaker counting.

Speech Recognition, Speech Separation

FloW: A Dataset and Benchmark for Floating Waste Detection in Inland Waters

1 code implementation • ICCV 2021 • Yuwei Cheng, Jiannan Zhu, Mengxin Jiang, Jie Fu, Changsong Pang, Peidong Wang, Kris Sankaran, Olawale Onabola, Yimin Liu, Dianbo Liu, Yoshua Bengio

To promote practical applications of autonomous floating-waste cleaning, we present FloW, the first dataset for floating waste detection in inland water areas.

Robust Object Detection

Efficient End-to-End Speech Recognition Using Performers in Conformers

no code implementations • 9 Nov 2020 • Peidong Wang, DeLiang Wang

On-device end-to-end speech recognition places high demands on model efficiency.

Speech Recognition

Multitask Training with Text Data for End-to-End Speech Recognition

no code implementations • 27 Oct 2020 • Peidong Wang, Tara N. Sainath, Ron J. Weiss

We propose a multitask training method for attention-based end-to-end speech recognition models.

Speech Recognition

Speaker Separation Using Speaker Inventories and Estimated Speech

no code implementations • 20 Oct 2020 • Peidong Wang, Zhuo Chen, DeLiang Wang, Jinyu Li, Yifan Gong

We propose speaker separation using speaker inventories and estimated speech (SSUSIES), a framework leveraging speaker profiles and estimated speech for speaker separation.

Speaker Separation, Speech Extraction

Multi-microphone Complex Spectral Mapping for Utterance-wise and Continuous Speech Separation

2 code implementations • 4 Oct 2020 • Zhong-Qiu Wang, Peidong Wang, DeLiang Wang

Although our system is trained on simulated room impulse responses (RIR) based on a fixed number of microphones arranged in a given geometry, it generalizes well to a real array with the same geometry.

Frame, Speaker Separation

Bridging the Gap Between Monaural Speech Enhancement and Recognition with Distortion-Independent Acoustic Modeling

no code implementations • 11 Mar 2019 • Peidong Wang, Ke Tan, DeLiang Wang

In this study, we analyze the distortion problem, compare different acoustic models, and investigate a distortion-independent training scheme for monaural speech recognition.

Automatic Speech Recognition, Speech Enhancement

Recurrent Deep Stacking Networks for Speech Recognition

no code implementations • 14 Dec 2016 • Peidong Wang, Zhongqiu Wang, DeLiang Wang

This paper presents our work on applying Recurrent Deep Stacking Networks (RDSNs) to robust automatic speech recognition (ASR) tasks.

Automatic Speech Recognition

Incorporating Language Level Information into Acoustic Models

no code implementations • 14 Dec 2016 • Peidong Wang, DeLiang Wang

This paper proposes a class of novel deep recurrent neural networks that can incorporate language-level information into acoustic models.

Automatic Speech Recognition

