SpeechEQ: Speech Emotion Recognition based on Multi-scale Unified Datasets and Multitask Learning

no code implementations • 27 Jun 2022 • Zuheng Kang, Junqing Peng, Jianzong Wang, Jing Xiao

Speech emotion recognition (SER) faces many challenges, and one of the main ones is that frameworks do not share a unified standard.

Speech Emotion Recognition

A Privacy-Preserving Subgraph-Level Federated Graph Neural Network via Differential Privacy

no code implementations • 7 Jun 2022 • Yeqing Qiu, Chenyu Huang, Jianzong Wang, Zhangcheng Huang, Jing Xiao

Currently, the federated graph neural network (GNN) has attracted a lot of attention due to its wide applications in reality without violating the privacy regulations.

Privacy Preserving

Micro-Expression Recognition Based on Attribute Information Embedding and Cross-modal Contrastive Learning

no code implementations • 29 May 2022 • Yanxin Song, Jianzong Wang, Tianbo Wu, Zhangcheng Huang, Jing Xiao

Micro-expressions have the characteristics of short duration and low intensity, and it is difficult to train a high-performance classifier with the limited number of existing micro-expressions.

Contrastive Learning, Micro-Expression Recognition

Adaptive Activation Network For Low Resource Multilingual Speech Recognition

no code implementations • 28 May 2022 • Jian Luo, Jianzong Wang, Ning Cheng, Zhenpeng Zheng, Jing Xiao

The existing models mostly established a bottleneck (BN) layer by pre-training on a large source language, and transferring to the low resource target language.

Automatic Speech Recognition, speech-recognition

Speech Augmentation Based Unsupervised Learning for Keyword Spotting

no code implementations • 28 May 2022 • Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao

In our experiments, with augmentation based unsupervised learning, our KWS model achieves better performance than other unsupervised methods, such as CPC, APC, and MPC.

Keyword Spotting

QSpeech: Low-Qubit Quantum Speech Application Toolkit

1 code implementation • 26 May 2022 • Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao

However, Quantum Neural Network (QNN) running on low-qubit quantum devices would be difficult since it is based on Variational Quantum Circuit (VQC), which requires many qubits.

DT-SV: A Transformer-based Time-domain Approach for Speaker Verification

no code implementations • 26 May 2022 • Nan Zhang, Jianzong Wang, Zhenhou Hong, Chendong Zhao, Xiaoyang Qu, Jing Xiao

Therefore, we propose an approach to derive utterance-level speaker embeddings via a Transformer architecture that uses a novel loss function named diffluence loss to integrate the feature information of different Transformer layers.

Speaker Verification

A Fair Federated Learning Framework With Reinforcement Learning

no code implementations • 26 May 2022 • Yaqi Sun, Shijing Si, Jianzong Wang, Yuhan Dong, Zhitao Zhu, Jing Xiao

More importantly, we apply the Gini coefficient and validation accuracy of clients in each communication round to construct a reward function for the reinforcement learning.

Fairness, Federated Learning +1
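
The reward idea in the excerpt above can be illustrated roughly as follows. This is a minimal sketch, not the paper's formulation: the excerpt only says that the Gini coefficient and the clients' validation accuracy are used to build the reward, so the way they are combined here (mean accuracy minus a weighted Gini term) and the names `gini`, `reward`, and `fairness_weight` are assumptions made for illustration.

```python
def gini(values):
    """Gini coefficient of non-negative values (0.0 means perfectly equal)."""
    n = len(values)
    total = sum(values)
    if n == 0 or total == 0:
        return 0.0
    # Sum of absolute differences over all ordered pairs,
    # normalised by 2 * n^2 * mean, which equals 2 * n * total.
    diff_sum = sum(abs(a - b) for a in values for b in values)
    return diff_sum / (2 * n * total)


def reward(client_accuracies, fairness_weight=1.0):
    """Hypothetical reward: favour high mean accuracy and low inequality."""
    mean_acc = sum(client_accuracies) / len(client_accuracies)
    return mean_acc - fairness_weight * gini(client_accuracies)
```

Under this assumed combination, two rounds with the same mean accuracy are ranked by how evenly that accuracy is spread across clients.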

Federated Split BERT for Heterogeneous Text Classification

no code implementations • 26 May 2022 • Zhengyang Li, Shijing Si, Jianzong Wang, Jing Xiao

To address this issue, we propose a framework, FedSplitBERT, which handles heterogeneous data and decreases the communication cost by splitting the BERT encoder layers into local part and global part.

Classification, Federated Learning +3

Federated Non-negative Matrix Factorization for Short Texts Topic Modeling with Mutual Information

no code implementations • 26 May 2022 • Shijing Si, Jianzong Wang, Ruiyi Zhang, Qinliang Su, Jing Xiao

Non-negative matrix factorization (NMF) based topic modeling is widely used in natural language processing (NLP) to uncover hidden topics of short text documents.

Federated Learning, Natural Language Processing +1

Leveraging Causal Inference for Explainable Automatic Program Repair

no code implementations • 26 May 2022 • Jianzong Wang, Shijing Si, Zhitao Zhu, Xiaoyang Qu, Zhenhou Hong, Jing Xiao

The experiments on four programming languages (Java, C, Python, and JavaScript) show that CPR can generate causal graphs for reasonable interpretations and boost the performance of bug fixing in automatic program repair.

Causal Inference, Data Augmentation +2

Cali3F: Calibrated Fast Fair Federated Recommendation System

no code implementations • 26 May 2022 • Zhitao Zhu, Shijing Si, Jianzong Wang, Jing Xiao

Specific to recommendation systems, many federated recommendation algorithms have been proposed to realize the privacy-preserving collaborative recommendation.

Fairness, Federated Learning +2

Augmentation-induced Consistency Regularization for Classification

no code implementations • 25 May 2022 • Jianhan Wu, Shijing Si, Jianzong Wang, Jing Xiao

In this paper, we propose a consistency regularization framework based on data augmentation, called CR-Aug, which forces the output distributions of different sub models generated by data augmentation to be consistent with each other.

Audio Classification, Classification +1
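
The consistency constraint described in the excerpt above can be sketched as a divergence between two output distributions. The excerpt does not say which divergence CR-Aug uses, so the symmetric KL divergence below, along with the names `softmax`, `kl`, and `consistency_loss`, is an assumption chosen for illustration. The two logit vectors stand in for the outputs of two sub-models on differently augmented views of the same input.

```python
import math


def softmax(logits):
    """Convert logits to a probability distribution (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]


def kl(p, q, eps=1e-12):
    """KL divergence KL(p || q) between two discrete distributions."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def consistency_loss(logits_a, logits_b):
    """Assumed regulariser: symmetric KL between the two output distributions."""
    p, q = softmax(logits_a), softmax(logits_b)
    return 0.5 * (kl(p, q) + kl(q, p))
```

In training, a term like this would typically be added to the usual classification loss with a weighting factor, so that it vanishes when both sub-models agree.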

Adaptive Few-Shot Learning Algorithm for Rare Sound Event Detection

no code implementations • 24 May 2022 • Chendong Zhao, Jianzong Wang, Leilai Li, Xiaoyang Qu, Jing Xiao

In this work, we propose a novel task-adaptive module that is easy to plug into any metric-based few-shot learning framework.

Event Detection, Few-Shot Learning +1

Self-Attention for Incomplete Utterance Rewriting

no code implementations • 24 Feb 2022 • Yong Zhang, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao

In this paper, we propose a novel method by directly extracting the coreference and omission relationship from the self-attention weight matrix of the transformer instead of word embeddings and edit the original text accordingly to generate the complete utterance.

Word Embeddings

Towards Speaker Age Estimation with Label Distribution Learning

no code implementations • 23 Feb 2022 • Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao

To address this, we utilize the ambiguous information among the age labels, convert each age label into a discrete label distribution and leverage the label distribution learning (LDL) method to fit the data.

Age Estimation, Multi-class Classification
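
The conversion from a single age label to a discrete label distribution, as described in the excerpt above, might look like the sketch below. The excerpt does not specify the shape of the distribution, so the discretised Gaussian, the age range of 0 to 100, the `sigma` value, and the function names are all assumptions for illustration.

```python
import math


def age_to_distribution(age, min_age=0, max_age=100, sigma=2.0):
    """Turn one age label into a normalised distribution over integer ages.

    Assumed form: a Gaussian centred on the labelled age, so that
    neighbouring ages receive some probability mass.
    """
    ages = range(min_age, max_age + 1)
    weights = [math.exp(-((a - age) ** 2) / (2 * sigma ** 2)) for a in ages]
    total = sum(weights)
    return [w / total for w in weights]


def kl_divergence(target, predicted, eps=1e-12):
    """KL(target || predicted), a common fitting loss for label distributions."""
    return sum(
        t * math.log((t + eps) / (p + eps))
        for t, p in zip(target, predicted)
        if t > 0
    )
```

A model trained this way would output a distribution over ages, and a point estimate could then be read off as its mode or expected value.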

VU-BERT: A Unified framework for Visual Dialog

no code implementations • 22 Feb 2022 • Tong Ye, Shijing Si, Jianzong Wang, Rui Wang, Ning Cheng, Jing Xiao

The visual dialog task attempts to train an agent to answer multi-turn questions given an image, which requires the deep understanding of interactions between the image and dialog history.

Language Modelling, Masked Language Modeling +1

ClsVC: Learning Speech Representations with two different classification tasks.

no code implementations • 29 Sep 2021 • Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao

Voice conversion (VC) aims to convert one speaker's voice into new speech that sounds as if it were spoken by another speaker.

Classification, Voice Conversion

Federated Learning with Dynamic Transformer for Text to Speech

no code implementations • 9 Jul 2021 • Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Jie Liu, Chendong Zhao, Jing Xiao

Text to speech (TTS) is a crucial task for user interaction, but TTS model training relies on a sizable set of high-quality original datasets.

Federated Learning

Loss Prediction: End-to-End Active Learning Approach For Speech Recognition

no code implementations • 9 Jul 2021 • Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao

End-to-end speech recognition systems usually require huge amounts of labeled data, while annotating speech data is complicated and expensive.

Active Learning, Automatic Speech Recognition +1

Efficient Client Contribution Evaluation for Horizontal Federated Learning

no code implementations • 26 Feb 2021 • Jie Zhao, Xinghua Zhu, Jianzong Wang, Jing Xiao

In this paper, an efficient method is proposed to evaluate the contributions of federated participants.

Federated Learning

Enhancing Data-Free Adversarial Distillation with Activation Regularization and Virtual Interpolation

no code implementations • 23 Feb 2021 • Xiaoyang Qu, Jianzong Wang, Jing Xiao

We add an activation regularizer and a virtual interpolation method to improve the data generation efficiency.

Knowledge Distillation

MelGlow: Efficient Waveform Generative Network Based on Location-Variable Convolution

4 code implementations • 3 Dec 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

In this paper, an efficient network, named location-variable convolution, is proposed to model the dependencies of waveforms.

Large-scale Transfer Learning for Low-resource Spoken Language Understanding

no code implementations • 13 Aug 2020 • Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao

However, the increased complexity of a model can also introduce high risk of over-fitting, which is a major challenge in SLU tasks due to the limitation of available data.

Automatic Speech Recognition, speech-recognition +2

Prosody Learning Mechanism for Speech Synthesis System Without Text Length Limit

no code implementations • 13 Aug 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao

Recent neural speech synthesis systems have gradually focused on the control of prosody to improve the quality of synthesized speech, but they rarely consider the variability of prosody and the correlation between prosody and semantics together.

Language Modelling, Prosody Prediction +1

MLNET: An Adaptive Multiple Receptive-field Attention Neural Network for Voice Activity Detection

no code implementations • 13 Aug 2020 • Zhenpeng Zheng, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao

The MLNET leveraged multi-branches to extract multiple contextual speech information and investigated an effective attention block to weight the most crucial parts of the context for final classification.

Action Detection, Activity Detection

MDCNN-SID: Multi-scale Dilated Convolution Network for Singer Identification

no code implementations • 9 Apr 2020 • Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao

Most singer identification methods are processed in the frequency domain, which potentially leads to information loss during the spectral transformation.

Artist classification, Music Generation +1

GraphTTS: graph-to-sequence modelling in neural text-to-speech

no code implementations • 4 Mar 2020 • Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Jing Xiao

This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms.

Graph Embedding, Graph-to-Sequence +1

AlignTTS: Efficient Feed-Forward Text-to-Speech System without Explicit Alignment

2 code implementations • 4 Mar 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Tian Xia, Jing Xiao

Targeting both high efficiency and high performance, we propose AlignTTS to predict the mel-spectrum in parallel.

A Robust Speaker Clustering Method Based on Discrete Tied Variational Autoencoder

no code implementations • 4 Mar 2020 • Chen Feng, Jianzong Wang, Tongxu Li, Junqing Peng, Jing Xiao

Recently, speaker clustering based on agglomerative hierarchical clustering (AHC) has become a common way to address two main problems: clustering with no preset number of categories and clustering with a fixed number of categories.

Dynamic Student Classiffication on Memory Networks for Knowledge Tracing

1 code implementation • 22 Mar 2019 • Sein Minn, Michel C. Desmarais, Feida Zhu, Jing Xiao, Jianzong Wang

Knowledge Tracing (KT) assesses a student's knowledge state and predicts whether that student will answer the next problem correctly, based on a number of previous practices and outcomes in their learning process.

Knowledge Tracing

