Search Results for author: Shilei Zhang

Found 11 papers, 2 papers with code

Plugin Speech Enhancement: A Universal Speech Enhancement Framework Inspired by Dynamic Neural Network

no code implementations • 20 Feb 2024 • Yanan Chen, Zihao Cui, Yingying Gao, Junlan Feng, Chao Deng, Shilei Zhang

In this study, we present a novel weighting prediction approach, which explicitly learns the task relationships from downstream training information to address the core challenge of universal speech enhancement.

Data Augmentation Speech Enhancement

Cascaded Multi-task Adaptive Learning Based on Neural Architecture Search

no code implementations • 23 Oct 2023 • Yingying Gao, Shilei Zhang, Zihao Cui, Chao Deng, Junlan Feng

Cascading multiple pre-trained models is an effective way to compose an end-to-end system.

Neural Architecture Search

GenDistiller: Distilling Pre-trained Language Models based on Generative Models

no code implementations • 20 Oct 2023 • Yingying Gao, Shilei Zhang, Zihao Cui, Yanhan Xu, Chao Deng, Junlan Feng

Self-supervised pre-trained models such as HuBERT and WavLM leverage unlabeled speech data for representation learning and offer significant improvements for numerous downstream tasks.

Knowledge Distillation Language Modelling +1

MFAS: Emotion Recognition through Multiple Perspectives Fusion Architecture Search Emulating Human Cognition

no code implementations • 12 Jun 2023 • Haiyang Sun, FuLin Zhang, Zheng Lian, Yingying Guo, Shilei Zhang

Additionally, considering that humans adjust their perception of emotional words in the textual semantics based on certain cues present in speech, we design a novel search space and search for the optimal fusion strategy for the two types of information.

Quantization Speech Emotion Recognition

Meta Auxiliary Learning for Low-resource Spoken Language Understanding

no code implementations • 26 Jun 2022 • Yingying Gao, Junlan Feng, Chao Deng, Shilei Zhang

Spoken language understanding (SLU) treats automatic speech recognition (ASR) and natural language understanding (NLU) as a unified task and usually suffers from data scarcity.

Automatic Speech Recognition (ASR) +4

A CTC Triggered Siamese Network with Spatial-Temporal Dropout for Speech Recognition

no code implementations • 16 Jun 2022 • Yingying Gao, Junlan Feng, Tianrui Wang, Chao Deng, Shilei Zhang

Analysis shows that our proposed approach brings better uniformity to the trained model and noticeably enlarges the CTC spikes.

Automatic Speech Recognition (ASR) +2

Multiple Confidence Gates For Joint Training Of SE And ASR

no code implementations • 1 Apr 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang

Joint training of a speech enhancement (SE) model and a speech recognition (ASR) model is a common solution for robust ASR in noisy environments.

Robust Speech Recognition Speech Enhancement +1

Harmonic gated compensation network plus for ICASSP 2022 DNS CHALLENGE

no code implementations • 25 Feb 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Yanan Chen, Junlan Feng, Shilei Zhang

Therefore, we previously proposed a harmonic gated compensation network (HGCN) to predict the full harmonic locations based on the unmasked harmonics and process the result of a coarse enhancement module to recover the masked harmonics.

HGCN: Harmonic gated compensation network for speech enhancement

1 code implementation • 30 Jan 2022 • Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang

Mask processing in the time-frequency (T-F) domain through a neural network has been one of the mainstream approaches to single-channel speech enhancement.

Action Detection Activity Detection +1

Identity-Enhanced Network for Facial Expression Recognition

no code implementations • 11 Dec 2018 • Yanwei Li, Xingang Wang, Shilei Zhang, Lingxi Xie, Wenqi Wu, Hongyuan Yu, Zheng Zhu

Facial expression recognition is a challenging task, arguably because of large intra-class variations and high inter-class similarities.

Facial Expression Recognition (FER) +1

Autoencoder Regularized Network For Driving Style Representation Learning

1 code implementation • 5 Jan 2017 • Weishan Dong, Ting Yuan, Kai Yang, Changsheng Li, Shilei Zhang

In this paper, we study learning generalized driving style representations from automobile GPS trip data.

Driver Identification Representation Learning
