Search Results for author: Wei-Qiang Zhang

Found 22 papers, 7 papers with code

SpeechColab Leaderboard: An Open-Source Platform for Automatic Speech Recognition Evaluation

1 code implementation • 13 Mar 2024 • Jiayu Du, Jinpeng Li, Guoguo Chen, Wei-Qiang Zhang

In this paper we introduce the SpeechColab Leaderboard, a general-purpose, open-source platform designed for ASR evaluation.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

Transferring speech-generic and depression-specific knowledge for Alzheimer's disease detection

no code implementations • 6 Oct 2023 • Ziyun Cui, Wen Wu, Wei-Qiang Zhang, Ji Wu, Chao Zhang

Apart from the knowledge from speech-generic representations, this paper also proposes to simultaneously transfer the knowledge from a speech depression detection task based on the high comorbidity rates of depression and AD.

Alzheimer's Disease Detection, Depression Detection, +1

DistilXLSR: A Light Weight Cross-Lingual Speech Representation Model

1 code implementation • 2 Jun 2023 • Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Jinfeng Bai

Multilingual self-supervised speech representation models have greatly enhanced the speech recognition performance for low-resource languages, and the compression of these huge models has also become a crucial prerequisite for their industrial application.

speech-recognition, Speech Recognition

Task-Agnostic Structured Pruning of Speech Representation Models

no code implementations • 2 Jun 2023 • Haoyu Wang, Siyuan Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan

Self-supervised pre-trained models such as Wav2vec2, Hubert, and WavLM have been shown to significantly improve many speech tasks.

Model Compression

Improving Speech Translation by Cross-Modal Multi-Grained Contrastive Learning

no code implementations • 20 Apr 2023 • Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Wei-Qiang Zhang

However, the final model often performs worse on the MT task than the MT model trained alone, which means that the knowledge transfer ability of this method is also limited.

Contrastive Learning, Machine Translation, +3

Unsupervised Anomaly Detection and Localization of Machine Audio: A GAN-based Approach

1 code implementation • 31 Mar 2023 • Anbai Jiang, Wei-Qiang Zhang, Yufeng Deng, Pingyi Fan, Jia Liu

Automatic detection of machine anomaly remains challenging for machine learning.

Denoising, Generative Adversarial Network, +2

Cross-lingual Alzheimer's Disease detection based on paralinguistic and pre-trained features

no code implementations • 14 Mar 2023 • Xuchu Chen, Yu Pu, Jinpeng Li, Wei-Qiang Zhang

We present our submission to the ICASSP-SPGC-2023 ADReSS-M Challenge Task, which aims to investigate which acoustic features can be generalized and transferred across languages for Alzheimer's Disease (AD) prediction.

Alzheimer's Disease Detection

MVKT-ECG: Efficient Single-lead ECG Classification on Multi-Label Arrhythmia by Multi-View Knowledge Transferring

no code implementations • 28 Jan 2023 • Yuzhen Qin, Li Sun, Hui Chen, Wei-Qiang Zhang, Wenming Yang, Jintao Fei, Guijin Wang

However, it is challenging to develop a single-lead-based ECG interpretation model for multi-disease diagnosis due to the lack of some key disease information.

ECG Classification, Knowledge Distillation

Expressive Speech-driven Facial Animation with controllable emotions

1 code implementation • 5 Jan 2023 • Yutong Chen, Junhong Zhao, Wei-Qiang Zhang

Generating highly realistic facial animation is in high demand, but it remains a challenging task.

LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification

no code implementations • 2 Nov 2022 • Xing Chen, Jie Wang, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang

It utilizes score variation as an indicator to detect adversarial examples, where the score variation is the absolute discrepancy between the ASV scores of an original audio recording and its transformed audio synthesized from its masked complex spectrogram.

Speaker Verification

Symmetric Saliency-based Adversarial Attack To Speaker Identification

no code implementations • 30 Oct 2022 • Jiadi Yao, Xing Chen, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang

To our knowledge, adversarial attack approaches to speaker identification either incur high computational cost or are not very effective.

Adversarial Attack, Speaker Identification

Exploring Effective Distillation of Self-Supervised Speech Models for Automatic Speech Recognition

no code implementations • 27 Oct 2022 • Yujin Wang, Changli Tang, Ziyang Ma, Zhisheng Zheng, Xie Chen, Wei-Qiang Zhang

Recent years have witnessed great strides in self-supervised learning (SSL) for speech processing.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +2

Multilingual Zero Resource Speech Recognition Base on Self-Supervise Pre-Trained Acoustic Models

no code implementations • 13 Oct 2022 • Haoyu Wang, Wei-Qiang Zhang, Hongbin Suo, Yulong Wan

Labeled audio data is insufficient to build satisfying speech recognition systems for most of the languages in the world.

Language Modelling, speech-recognition, +1

Summary on the ISCSLP 2022 Chinese-English Code-Switching ASR Challenge

no code implementations • 12 Oct 2022 • Shuhao Deng, Chengfei Li, Jinfeng Bai, Qingqing Zhang, Wei-Qiang Zhang, Runyan Yang, Gaofeng Cheng, Pengyuan Zhang, Yonghong Yan

Code-switching automatic speech recognition has become one of the most challenging and most valuable scenarios of automatic speech recognition, owing to code-switching between languages and its frequent occurrence in daily life.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

The THUEE System Description for the IARPA OpenASR21 Challenge

no code implementations • 29 Jun 2022 • Jing Zhao, Haoyu Wang, Jinpeng Li, Shuzhou Chai, Guan-Bo Wang, Guoguo Chen, Wei-Qiang Zhang

For the Constrained training condition, we construct our basic ASR system based on the standard hybrid architecture.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +3

BERT-LID: Leveraging BERT to Improve Spoken Language Identification

1 code implementation • 1 Mar 2022 • Yuting Nie, Junhong Zhao, Wei-Qiang Zhang, Jinfeng Bai

It has a profound impact on the multilingual interoperability of an intelligent speech system.

Language Identification, Spoken Language Identification

Full Attention Bidirectional Deep Learning Structure for Single Channel Speech Enhancement

no code implementations • 27 Aug 2021 • Yuzi Yan, Wei-Qiang Zhang, Michael T. Johnson

As the cornerstone of other important technologies, such as speech recognition and speech synthesis, speech enhancement is a critical area in audio signal processing.

Audio Signal Processing, Speech Enhancement, +3

AdaSpeech 3: Adaptive Text to Speech for Spontaneous Style

no code implementations • 6 Jul 2021 • Yuzi Yan, Xu Tan, Bohan Li, Guangyan Zhang, Tao Qin, Sheng Zhao, Yuan Shen, Wei-Qiang Zhang, Tie-Yan Liu

While recent text to speech (TTS) models perform very well in synthesizing reading-style (e.g., audiobook) speech, it is still challenging to synthesize spontaneous-style speech (e.g., podcast or conversation), mainly because of two reasons: 1) the lack of training data for spontaneous speech; 2) the difficulty in modeling the filled pauses (um and uh) and diverse rhythms in spontaneous speech.

DeepRapper: Neural Rap Generation with Rhyme and Rhythm Modeling

1 code implementation • ACL 2021 • Lanqing Xue, Kaitao Song, Duocai Wu, Xu Tan, Nevin L. Zhang, Tao Qin, Wei-Qiang Zhang, Tie-Yan Liu

In this paper, we develop DeepRapper, a Transformer-based rap generation system that can model both rhymes and rhythms.

Language Modelling

GigaSpeech: An Evolving, Multi-domain ASR Corpus with 10,000 Hours of Transcribed Audio

2 code implementations • 13 Jun 2021 • Guoguo Chen, Shuzhou Chai, Guanbo Wang, Jiayu Du, Wei-Qiang Zhang, Chao Weng, Dan Su, Daniel Povey, Jan Trmal, Junbo Zhang, Mingjie Jin, Sanjeev Khudanpur, Shinji Watanabe, Shuaijiang Zhao, Wei Zou, Xiangang Li, Xuchen Yao, Yongqing Wang, Yujun Wang, Zhao You, Zhiyong Yan

This paper introduces GigaSpeech, an evolving, multi-domain English speech recognition corpus with 10,000 hours of high quality labeled audio suitable for supervised training, and 40,000 hours of total audio suitable for semi-supervised and unsupervised training.

Ranked #1 on Speech Recognition on GigaSpeech

Sentence, speech-recognition, +1

THUEE system description for NIST 2019 SRE CTS Challenge

no code implementations • 25 Dec 2019 • Yi Liu, Tianyu Liang, Can Xu, Xianwei Zhang, Xianhong Chen, Wei-Qiang Zhang, Liang He, Dandan Song, Ruyun Li, Yangcheng Wu, Peng Ouyang, Shouyi Yin

This paper describes the systems submitted by the Department of Electronic Engineering, Institute of Microelectronics of Tsinghua University and TsingMicro Co. Ltd. (THUEE) to the NIST 2019 speaker recognition evaluation CTS challenge.

Speaker Recognition

SAM-GCNN: A Gated Convolutional Neural Network with Segment-Level Attention Mechanism for Home Activity Monitoring

no code implementations • 3 Oct 2018 • Yu-Han Shen, Ke-Xin He, Wei-Qiang Zhang

To tackle this task, we propose a gated convolutional neural network with segment-level attention mechanism (SAM-GCNN).

Home Activity Monitoring
