no code implementations • NAACL (AutoSimTrans) 2021 • Linjie Chen, Jianzong Wang, Zhangcheng Huang, Xiongbin Ding, Jing Xiao
This paper presents our submission to the second Automatic Simultaneous Translation workshop at NAACL 2021.
no code implementations • 18 Jun 2024 • Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao
Our framework is designed to be communication-efficient: computation can be delegated to the local clients so that the server's computation burden is lightened.
no code implementations • 28 May 2024 • Jianzong Wang, Haoxiang Shi, Kaiyi Luo, xulong Zhang, Ning Cheng, Jing Xiao
For unpaired data, to effectively capture the latent discriminative features, the high-order relationships between unpaired data and anchors, computed by efficient linear reconstruction, are embedded into the latent subspace.
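As an illustration of the anchor idea, the sketch below reconstructs each unpaired sample as a linear combination of a small set of anchors and uses the coefficients to form a sample-to-sample relationship matrix. This is a minimal sketch only; the least-squares solver, matrix shapes, and function names are assumptions rather than the paper's implementation.

```python
import numpy as np

def anchor_reconstruction(X, A):
    """Reconstruct each sample as a linear combination of anchors.

    X: (n, d) unpaired samples, A: (m, d) anchor points.
    Returns Z: (n, m) reconstruction coefficients (least squares).
    """
    # Solve Z A ~= X, i.e. A^T Z^T = X^T, via least squares on the anchor matrix.
    Z, *_ = np.linalg.lstsq(A.T, X.T, rcond=None)
    return Z.T

def anchor_similarity(Z):
    """Anchor-mediated relationship between samples: S = Z Z^T."""
    return Z @ Z.T

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    X = rng.normal(size=(100, 32))   # unpaired samples
    A = rng.normal(size=(16, 32))    # anchors (e.g., cluster centers)
    Z = anchor_reconstruction(X, A)
    S = anchor_similarity(Z)         # (100, 100) relationship matrix
    print(S.shape)
```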
no code implementations • 28 May 2024 • Haoxiang Shi, xulong Zhang, Ning Cheng, Yong Zhang, Jun Yu, Jing Xiao, Jianzong Wang
Previous ERC methods relied on simple connections for cross-modal fusion and ignored the information differences between modalities, resulting in the model being unable to focus on modality-specific emotional information.
no code implementations • 22 May 2024 • Zhiyuan Wang, Bokui Chen, Xiaoyang Qu, Zhenhou Hong, Jing Xiao, Jianzong Wang
Our findings underscore the efficacy of the FSDT framework in effectively leveraging distributed offline reinforcement learning data to enable powerful multi-type agent decision systems.
no code implementations • 11 May 2024 • Shenglin He, Xiaoyang Qu, Jiguang Wan, Guokuan Li, Changsheng Xie, Jianzong Wang
To address this problem, we propose a Plane-Fit Redundancy Encoding point cloud sequence network named PRENet.
no code implementations • 30 Apr 2024 • Sheng Ouyang, Jianzong Wang, Yong Zhang, Zhitao Li, ZiQi Liang, xulong Zhang, Ning Cheng, Jing Xiao
Extractive Question Answering (EQA) in Machine Reading Comprehension (MRC) often faces the challenge of dealing with semantically identical but format-variant inputs.
Extractive Question-Answering • Machine Reading Comprehension • +1
no code implementations • 24 Apr 2024 • Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Jing Xiao
Single-model systems often suffer from deficiencies in tasks such as speaker verification (SV) and image classification: they rely heavily on partial prior knowledge during decision-making, which results in suboptimal performance.
no code implementations • 22 Apr 2024 • Zuheng Kang, Yayun He, Botao Zhao, Xiaoyang Qu, Junqing Peng, Jing Xiao, Jianzong Wang
With recent advances in speech synthesis including text-to-speech (TTS) and voice conversion (VC) systems enabling the generation of ultra-realistic audio deepfakes, there is growing concern about their potential misuse.
no code implementations • 8 Mar 2024 • Jianzong Wang, Pengcheng Li, xulong Zhang, Ning Cheng, Jing Xiao
After combining the intent from two domains into a joint representation, the integrated intent representation is fed into a decision layer for classification.
1 code implementation • 1 Feb 2024 • Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou
Data filtering for instruction tuning has proved important in improving both the efficiency and performance of the tuning process.
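As a hedged illustration of such filtering, the sketch below scores each (instruction, response) pair with a small language model using a loss-ratio criterion and keeps the highest-scoring pairs; GPT-2 as the scorer, the ratio form, and the function names are assumptions, not necessarily the criterion used in this paper.

```python
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tok = GPT2TokenizerFast.from_pretrained("gpt2")
lm = GPT2LMHeadModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def response_loss(prompt: str, response: str) -> float:
    """Average token loss of `response`, optionally conditioned on `prompt`."""
    prompt_ids = tok(prompt, return_tensors="pt").input_ids if prompt else None
    resp_ids = tok(response, return_tensors="pt").input_ids
    ids = resp_ids if prompt_ids is None else torch.cat([prompt_ids, resp_ids], dim=1)
    labels = ids.clone()
    if prompt_ids is not None:
        labels[:, : prompt_ids.size(1)] = -100  # score only the response tokens
    return lm(ids, labels=labels).loss.item()

def difficulty_score(instruction: str, response: str) -> float:
    """Ratio of conditioned to unconditioned response loss; a higher ratio means
    the instruction helps less, i.e. a harder / more informative pair."""
    return response_loss(instruction, response) / response_loss("", response)

# Keep the top-k highest-scoring (instruction, response) pairs for tuning.
```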
no code implementations • 24 Jan 2024 • Wei Tao, Shenglin He, Kai Lu, Xiaoyang Qu, Guokuan Li, Jiguang Wan, Jianzong Wang, Jing Xiao
In addition, for patches without outlier values, we utilize value-driven quantization search (VDQS) on the feature maps of their following dataflow branches to reduce search time.
no code implementations • 22 Jan 2024 • Zhiyuan Wang, Xiaoyang Qu, Jing Xiao, Bokui Chen, Jianzong Wang
Catastrophic forgetting poses a substantial challenge for managing intelligent agents controlled by a large model, causing performance degradation when these agents face new tasks.
no code implementations • 22 Jan 2024 • Zhiyuan Wang, Xiaoyang Qu, Jing Xiao, Bokui Chen, Jianzong Wang
This paper introduces INCPrompt, an innovative continual learning solution that effectively addresses catastrophic forgetting.
no code implementations • 18 Jan 2024 • Yong Zhang, Hanzhang Li, Zhitao Li, Ning Cheng, Ming Li, Jing Xiao, Jianzong Wang
Large Language Models (LLMs) have shown significant promise in various applications, including zero-shot and few-shot learning.
no code implementations • 16 Jan 2024 • Haobin Tang, xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang
We introduce ED-TTS, a multi-scale emotional speech synthesis model that leverages Speech Emotion Diarization (SED) and Speech Emotion Recognition (SER) to model emotions at different levels.
no code implementations • 16 Jan 2024 • Bingyuan Zhang, xulong Zhang, Ning Cheng, Jun Yu, Jing Xiao, Jianzong Wang
In recent years, the field of talking face generation has attracted considerable attention, with certain methods adept at generating virtual faces that convincingly imitate human expressions.
1 code implementation • NeurIPS 2023 • Jinggang Chen, Junjie Li, Xiaoyang Qu, Jianzong Wang, Jiguang Wan, Jing Xiao
This perspective is motivated by our observation that gradient-based attribution methods encounter challenges in assigning feature importance to OOD data, thereby yielding divergent explanation patterns.
no code implementations • 15 Nov 2023 • Jianzong Wang, Yimin Deng, ZiQi Liang, xulong Zhang, Ning Cheng, Jing Xiao
This paper proposes a talking face generation method named "CP-EB" that takes an audio signal as input and a person image as reference, to synthesize a photo-realistic talking video of that person, with head poses controlled by a short video clip and proper eye-blinking embedding.
no code implementations • 23 Oct 2023 • Haoyan Yang, Zhitao Li, Yong Zhang, Jianzong Wang, Ning Cheng, Ming Li, Jing Xiao
The Retrieval Question Answering (ReQA) task employs the retrieval-augmented framework, composed of a retriever and generator.
no code implementations • 7 Oct 2023 • Yayun He, Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
Speaker verification (SV) performance deteriorates as utterances become shorter.
no code implementations • 16 Sep 2023 • Yazhong Si, xulong Zhang, Fan Yang, Jianzong Wang, Ning Cheng, Jing Xiao
Most existing sandstorm image enhancement methods are based on traditional theory and prior knowledge, which often restrict their applicability in real-world scenarios.
no code implementations • 14 Sep 2023 • Zipeng Qi, xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang
Generating realistic talking faces is a complex and widely discussed task with numerous applications.
no code implementations • 28 Aug 2023 • xulong Zhang, Jianzong Wang, Ning Cheng, Yifu Sun, Chuanyao Zhang, Jing Xiao
The rise of the phenomenon of the "right to be forgotten" has prompted research on machine unlearning, which grants data owners the right to actively withdraw data that has been used for model training, and requires the elimination of the contribution of that data to the model.
2 code implementations • 23 Aug 2023 • Ming Li, Yong Zhang, Zhitao Li, Jiuhai Chen, Lichang Chen, Ning Cheng, Jianzong Wang, Tianyi Zhou, Jing Xiao
In the realm of Large Language Models (LLMs), the balance between instruction data quality and quantity is a focal point.
no code implementations • 17 Aug 2023 • Liang Wang, Nan Zhang, Xiaoyang Qu, Jianzong Wang, Jiguang Wan, Guokuan Li, Kaiyu Hu, Guilin Jiang, Jing Xiao
In this paper, we introduce EdgeMA, a practical and efficient video analytics system designed to adapt models to shifts in real-world video streams over time, addressing the data drift problem.
no code implementations • 7 Aug 2023 • Yong Zhang, Zhitao Li, Jianzong Wang, Yiming Gao, Ning Cheng, Fengying Yu, Jing Xiao
Conversational Question Answering (CQA) is a challenging task that aims to generate natural answers for conversational flow questions.
no code implementations • 7 Aug 2023 • Jiaxin Fan, Yong Zhang, Hanzhang Li, Jianzong Wang, Zhitao Li, Sheng Ouyang, Ning Cheng, Jing Xiao
Chinese Automatic Speech Recognition (ASR) error correction presents significant challenges due to the Chinese language's unique features, including a large character set and borderless, morpheme-based structure.
Automatic Speech Recognition (ASR) • +1
no code implementations • 27 Jun 2023 • Liang Wang, Kai Lu, Nan Zhang, Xiaoyang Qu, Jianzong Wang, Jiguang Wan, Guokuan Li, Jing Xiao
This paper proposes Shoggoth, an efficient edge-cloud collaborative architecture, for boosting inference performance on real-time video of changing scenes.
no code implementations • 27 Jun 2023 • Chenghao Liu, Xiaoyang Qu, Jianzong Wang, Jing Xiao
To address local forgetting caused by new classes of new tasks and global forgetting brought by non-i.i.d. (non-independent and identically distributed) class imbalance across different local clients, we propose an Enhancer distillation method to modify the imbalance between old and new knowledge and repair the non-i.i.d. problem.
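For intuition, the sketch below shows a generic class-incremental objective that balances new-class supervision with distillation of the frozen old model's predictions on old classes. It is only a hedged stand-in for the Enhancer distillation described above; the loss weights and names are assumptions.

```python
import torch
import torch.nn.functional as F

def incremental_distillation_loss(new_logits, old_logits, targets,
                                  num_old_classes, T=2.0, alpha=0.5):
    """Generic class-incremental loss: cross-entropy on all current classes
    plus distillation of the frozen old model's outputs on the old classes.

    new_logits: (B, C_new) outputs of the model being trained.
    old_logits: (B, C_old) outputs of the frozen old model (C_old = num_old_classes).
    """
    ce = F.cross_entropy(new_logits, targets)
    # Distill only over the old-class slice of the new model's logits.
    p_old = F.log_softmax(new_logits[:, :num_old_classes] / T, dim=1)
    q_old = F.softmax(old_logits / T, dim=1)
    kd = F.kl_div(p_old, q_old, reduction="batchmean") * (T * T)
    return alpha * ce + (1.0 - alpha) * kd
```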
no code implementations • 31 May 2023 • Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
To address this, we propose a speaker verification-based voice activity detection (SVVAD) framework that can adapt the speech features according to which parts are most informative for SV.
no code implementations • 23 Apr 2023 • Rongfeng Pan, Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Jing Xiao
To eliminate this concern, we propose a federated learning text summarization scheme, which allows users to share the global model in a cooperative learning manner without sharing raw data.
no code implementations • 17 Mar 2023 • Jinggang Chen, Xiaoyang Qu, Junjie Li, Jianzong Wang, Jiguang Wan, Jing Xiao
Out-of-distribution (OOD) detection aims at enhancing standard deep neural networks to distinguish anomalous inputs from original training data.
no code implementations • 15 Mar 2023 • Tong Ye, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao
Deep neural networks have achieved remarkable performance in retrieval-based dialogue systems, but they are shown to be ill-calibrated.
no code implementations • 15 Mar 2023 • Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Zhitao Li, Jing Xiao
Deep neural retrieval models have amply demonstrated their power but estimating the reliability of their predictions remains challenging.
no code implementations • 14 Mar 2023 • Haobin Tang, xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Recent expressive text to speech (TTS) models focus on synthesizing emotional speech, but some fine-grained styles such as intonation are neglected.
no code implementations • 14 Mar 2023 • xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
By predicting all the target tokens in parallel, non-autoregressive models greatly improve the decoding efficiency of speech recognition compared with traditional autoregressive models.
no code implementations • 14 Mar 2023 • Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Xiaoyang Qu, Jing Xiao
Data-Free Knowledge Distillation (DFKD) has recently attracted growing attention in the academic community, especially with major breakthroughs in computer vision.
no code implementations • 14 Mar 2023 • Kexin Zhu, xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Using deep learning methods to classify EEG signals can accurately identify people's emotions.
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Mengyuan Zhao, Zhiyong Zhang, Jing Xiao
We also find that in the joint CTC-Attention ASR model, the decoder is more sensitive to linguistic information than to acoustic information.
Automatic Speech Recognition (ASR) • +3
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
In this paper, we propose Adapitch, a multi-speaker TTS method that adapts the supervised module with untranscribed data.
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Kexin Zhu, Jing Xiao
In this work, we propose two masking approaches: (1) speech-level masking, which makes the model mask more speech segments than silence segments, and (2) phoneme-level masking, which forces the model to mask the whole frame span of a phoneme instead of phoneme pieces.
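A minimal sketch of the two policies on a frame sequence is shown below, assuming a per-frame voice-activity flag and per-frame phoneme ids from a forced aligner; the bias factor, mask ratio, and function names are illustrative assumptions rather than the paper's exact settings.

```python
import numpy as np

def speech_level_mask(is_speech, mask_ratio=0.15, speech_bias=3.0, rng=None):
    """Sample frame masks so that speech frames are `speech_bias` times more
    likely to be masked than silence frames (speech-level masking)."""
    rng = rng or np.random.default_rng()
    weights = np.where(is_speech, speech_bias, 1.0)
    probs = mask_ratio * len(is_speech) * weights / weights.sum()
    return rng.random(len(is_speech)) < np.clip(probs, 0.0, 1.0)

def phoneme_level_mask(phoneme_ids, mask_ratio=0.15, rng=None):
    """Mask whole phonemes: pick phoneme segments at random and mask every
    frame they cover, instead of masking partial phoneme pieces."""
    rng = rng or np.random.default_rng()
    unique_phones = np.unique(phoneme_ids)
    n_masked = max(1, int(round(mask_ratio * len(unique_phones))))
    chosen = rng.choice(unique_phones, size=n_masked, replace=False)
    return np.isin(phoneme_ids, chosen)

# Frames: 0/1 speech activity and per-frame phoneme segment ids.
is_speech = np.array([0, 0, 1, 1, 1, 1, 0, 1, 1, 0], dtype=bool)
phoneme_ids = np.array([0, 0, 1, 1, 1, 2, 2, 3, 3, 3])
print(speech_level_mask(is_speech))
print(phoneme_level_mask(phoneme_ids))
```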
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
The Metaverse expands the physical world into a new dimension, allowing the physical environment and the Metaverse environment to be directly connected and entered.
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Recent advances in pre-trained language models have improved the performance for text classification tasks.
no code implementations • 25 Oct 2022 • xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Most previous neural text-to-speech (TTS) methods are mainly based on supervised learning, which means they depend on a large training dataset and find it hard to achieve comparable performance under low-resource conditions.
no code implementations • 18 Oct 2022 • Zuheng Kang, Jianzong Wang, Junqing Peng, Jing Xiao
Estimating age from a single speech utterance is a classic and challenging topic.
no code implementations • 15 Oct 2022 • Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
Unsupervised representation learning for speech audio has attained impressive performance on speech recognition tasks, particularly when annotated speech is limited.
no code implementations • 13 Oct 2022 • Aolan Sun, xulong Zhang, Tiandong Ling, Jianzong Wang, Ning Cheng, Jing Xiao
Since the beginning of the COVID-19 pandemic, remote conferencing and school-teaching have become important tools.
no code implementations • 7 Oct 2022 • Jianhan Wu, Jianzong Wang, Shijing Si, Xiaoyang Qu, Jing Xiao
Most existing methods encode the texture of the whole reference human image into a latent space, and then utilize a decoder to synthesize the image texture of the target pose.
no code implementations • 30 Sep 2022 • Zihao Cao, Jianzong Wang, Shijing Si, Zhangcheng Huang, Jing Xiao
Even when data is removed from the dataset, the effects of these data persist in the model.
no code implementations • 30 Sep 2022 • Denghao Li, Yuqiao Zeng, Jianzong Wang, Lingwei Kong, Zhangcheng Huang, Ning Cheng, Xiaoyang Qu, Jing Xiao
Buddhism is an influential religion with a long-standing history and profound philosophy.
no code implementations • 30 Sep 2022 • Chendong Zhao, Jianzong Wang, Wen qi Wei, Xiaoyang Qu, Haoqian Wang, Jing Xiao
For multi-head attention in Transformer ASR, it is not easy to model monotonic alignments in different heads.
Automatic Speech Recognition (ASR) • +1
no code implementations • 30 Sep 2022 • Wen Wang, Jianzong Wang, Shijing Si, Zhangcheng Huang, Jing Xiao
The extraction of sequence patterns from a collection of functionally linked unlabeled DNA sequences is known as DNA motif discovery, and it is a key task in computational biology.
no code implementations • 21 Sep 2022 • Shijing Si, Jianzong Wang, xulong Zhang, Xiaoyang Qu, Ning Cheng, Jing Xiao
Nonparallel multi-domain voice conversion methods such as the StarGAN-VCs have been widely applied in many scenarios.
no code implementations • 24 Aug 2022 • Zhitao Zhu, Shijing Si, Jianzong Wang, Yaodong Yang, Jing Xiao
Because of their many complicated nonlinear units, deep neural networks can capture the intricate interaction history between queries and documents, allowing them to provide correct search recommendations.
1 code implementation • 18 Aug 2022 • Sicheng Yang, Methawee Tantrawenith, Haolin Zhuang, Zhiyong Wu, Aolan Sun, Jianzong Wang, Ning Cheng, Huaizhen Tang, Xintao Zhao, Jie Wang, Helen Meng
One-shot voice conversion (VC) with only a single target speaker's speech for reference has become a hot research topic.
no code implementations • 8 Aug 2022 • Huaizhen Tang, xulong Zhang, Jianzong Wang, Ning Cheng, Zhen Zeng, Edward Xiao, Jing Xiao
In this paper, a novel voice conversion framework, named Text Guided AutoVC (TGAVC), is proposed to more effectively separate content and timbre from speech, where an expected content embedding produced based on the text transcriptions is designed to guide the extraction of voice content.
1 code implementation • 27 Jun 2022 • Tong Ye, Shijing Si, Jianzong Wang, Ning Cheng, Jing Xiao
In this work, we investigate the uncertainty calibration for deep audio classifiers.
no code implementations • 27 Jun 2022 • Zuheng Kang, Junqing Peng, Jianzong Wang, Jing Xiao
Speech emotion recognition (SER) has many challenges, but one of the main ones is that the different frameworks do not share a unified standard.
no code implementations • 7 Jun 2022 • Yeqing Qiu, Chenyu Huang, Jianzong Wang, Zhangcheng Huang, Jing Xiao
Currently, the federated graph neural network (GNN) has attracted a lot of attention due to its wide range of real-world applications that do not violate privacy regulations.
no code implementations • 29 May 2022 • Yanxin Song, Jianzong Wang, Tianbo Wu, Zhangcheng Huang, Jing Xiao
Micro-expressions are characterized by short duration and low intensity, and it is difficult to train a high-performance classifier with the limited number of existing micro-expression samples.
no code implementations • 28 May 2022 • Jian Luo, Jianzong Wang, Ning Cheng, Zhenpeng Zheng, Jing Xiao
Existing models mostly establish a bottleneck (BN) layer by pre-training on a large source language and then transfer it to the low-resource target language.
Automatic Speech Recognition (ASR) • +1
no code implementations • 28 May 2022 • Jian Luo, Jianzong Wang, Ning Cheng, Haobin Tang, Jing Xiao
In our experiments, with augmentation based unsupervised learning, our KWS model achieves better performance than other unsupervised methods, such as CPC, APC, and MPC.
no code implementations • 26 May 2022 • Nan Zhang, Jianzong Wang, Zhenhou Hong, Chendong Zhao, Xiaoyang Qu, Jing Xiao
Therefore, we propose an approach to derive utterance-level speaker embeddings via a Transformer architecture that uses a novel loss function named diffluence loss to integrate the feature information of different Transformer layers.
no code implementations • 26 May 2022 • Jianzong Wang, Shijing Si, Zhitao Zhu, Xiaoyang Qu, Zhenhou Hong, Jing Xiao
The experiments on four programming languages (Java, C, Python, and JavaScript) show that CPR can generate causal graphs for reasonable interpretations and boost the performance of bug fixing in automatic program repair.
no code implementations • 26 May 2022 • Shijing Si, Jianzong Wang, Ruiyi Zhang, Qinliang Su, Jing Xiao
Non-negative matrix factorization (NMF) based topic modeling is widely used in natural language processing (NLP) to uncover hidden topics of short text documents.
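For reference, a standard NMF topic-modeling pipeline over a TF-IDF matrix looks like the sketch below; this is generic scikit-learn usage, not the specific variant proposed in the paper.

```python
from sklearn.decomposition import NMF
from sklearn.feature_extraction.text import TfidfVectorizer

docs = [
    "federated learning protects user data privacy",
    "speech recognition with transformer acoustic models",
    "privacy preserving distributed model training",
    "text to speech synthesis and voice conversion",
]

# Standard NMF topic modeling: factor the TF-IDF matrix V ~= W H, where rows of H
# are topics over words and rows of W are per-document topic weights.
vectorizer = TfidfVectorizer(stop_words="english")
V = vectorizer.fit_transform(docs)
nmf = NMF(n_components=2, init="nndsvda", random_state=0)
W = nmf.fit_transform(V)         # document-topic weights
H = nmf.components_              # topic-word weights

terms = vectorizer.get_feature_names_out()
for k, topic in enumerate(H):
    top = topic.argsort()[::-1][:4]
    print(f"topic {k}:", [terms[i] for i in top])
```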
no code implementations • 26 May 2022 • Zhengyang Li, Shijing Si, Jianzong Wang, Jing Xiao
To address this issue, we propose a framework, FedSplitBERT, which handles heterogeneous data and decreases the communication cost by splitting the BERT encoder layers into local part and global part.
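A minimal sketch of the split-aggregation idea follows: only parameters designated as global are averaged on the server, while the remaining layers stay on the clients. The layer-splitting rule, parameter naming, and split index below are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np

SPLIT_LAYER = 6  # assumption: encoder layers below this index are shared globally

def is_global(param_name: str) -> bool:
    """Treat embeddings and encoder layers < SPLIT_LAYER as the global part;
    higher layers and the classifier head stay local to each client."""
    if param_name.startswith("embeddings."):
        return True
    if param_name.startswith("encoder.layer."):
        layer_idx = int(param_name.split(".")[2])
        return layer_idx < SPLIT_LAYER
    return False

def aggregate_global(client_states, client_weights):
    """FedAvg over the global parameters only; local parts are never sent."""
    total = sum(client_weights)
    agg = {}
    for name in client_states[0]:
        if is_global(name):
            summed = sum(w * s[name] for w, s in zip(client_weights, client_states))
            agg[name] = summed / total
    return agg

# Toy example with two clients and two parameters.
c1 = {"encoder.layer.0.weight": np.ones(3), "encoder.layer.11.weight": np.ones(3)}
c2 = {"encoder.layer.0.weight": np.zeros(3), "encoder.layer.11.weight": np.zeros(3)}
print(aggregate_global([c1, c2], [1.0, 1.0]))  # only layer 0 is averaged
```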
no code implementations • 26 May 2022 • Yaqi Sun, Shijing Si, Jianzong Wang, Yuhan Dong, Zhitao Zhu, Jing Xiao
More importantly, we apply the Gini coefficient and the validation accuracy of clients in each communication round to construct a reward function for reinforcement learning.
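As an illustration, the sketch below computes the Gini coefficient of per-client validation accuracies and combines it with the mean accuracy into a per-round reward; the exact way the two terms are combined is an assumption, not the paper's formula.

```python
import numpy as np

def gini(values):
    """Gini coefficient of a set of non-negative values (0 = perfectly equal)."""
    v = np.sort(np.asarray(values, dtype=float))
    n = len(v)
    cum = np.cumsum(v)
    return (n + 1 - 2 * np.sum(cum) / cum[-1]) / n

def round_reward(client_accuracies, fairness_weight=0.5):
    """Illustrative reward: high mean validation accuracy, penalized by the
    Gini coefficient (inequality) of accuracy across clients."""
    acc = np.asarray(client_accuracies, dtype=float)
    return acc.mean() - fairness_weight * gini(acc)

print(round_reward([0.80, 0.82, 0.79, 0.81]))  # balanced clients -> small penalty
print(round_reward([0.95, 0.60, 0.55, 0.90]))  # unequal clients -> larger penalty
```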
1 code implementation • 26 May 2022 • Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Chendong Zhao, Wei Tao, Jing Xiao
However, Quantum Neural Network (QNN) running on low-qubit quantum devices would be difficult since it is based on Variational Quantum Circuit (VQC), which requires many qubits.
no code implementations • 26 May 2022 • Zhitao Zhu, Shijing Si, Jianzong Wang, Jing Xiao
Specific to recommendation systems, many federated recommendation algorithms have been proposed to realize privacy-preserving collaborative recommendation.
no code implementations • 25 May 2022 • Jianhan Wu, Shijing Si, Jianzong Wang, Jing Xiao
In this paper, we propose a consistency regularization framework based on data augmentation, called CR-Aug, which forces the output distributions of different sub models generated by data augmentation to be consistent with each other.
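A minimal sketch of such a consistency objective is shown below, using a symmetric KL term between the outputs of two augmented views plus the usual supervised loss; the specific augmentations, weighting, and divergence are assumptions rather than the exact CR-Aug formulation.

```python
import torch
import torch.nn.functional as F

def cr_aug_loss(model, x_aug1, x_aug2, targets, lam=1.0):
    """Minimal consistency-regularization sketch: supervised loss on both
    augmented views plus a symmetric KL term that forces the two output
    distributions to agree."""
    logits1, logits2 = model(x_aug1), model(x_aug2)
    sup = 0.5 * (F.cross_entropy(logits1, targets) + F.cross_entropy(logits2, targets))
    p1, p2 = F.log_softmax(logits1, dim=-1), F.log_softmax(logits2, dim=-1)
    consistency = 0.5 * (
        F.kl_div(p1, p2.exp(), reduction="batchmean")
        + F.kl_div(p2, p1.exp(), reduction="batchmean")
    )
    return sup + lam * consistency
```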
no code implementations • 24 May 2022 • Chendong Zhao, Jianzong Wang, Leilai Li, Xiaoyang Qu, Jing Xiao
In this work, we propose a novel task-adaptive module which is easy to plant into any metric-based few-shot learning frameworks.
no code implementations • 24 May 2022 • Jianhan Wu, Shijing Si, Jianzong Wang, Jing Xiao
And the second is that the training of GANs is unstable and slow to converge, for example suffering from mode collapse.
no code implementations • 24 Feb 2022 • Yong Zhang, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao
In this paper, we propose a novel method that directly extracts the coreference and omission relationships from the self-attention weight matrix of the transformer, instead of from word embeddings, and edits the original text accordingly to generate the complete utterance.
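To make the idea concrete, the toy sketch below picks, for a pronoun position, the preceding token with the highest head-averaged attention weight; it only illustrates the general principle of reading relationships off the self-attention matrix and is not the paper's rewriting algorithm.

```python
import numpy as np

def resolve_from_attention(attn, pronoun_idx):
    """Toy attention-based coreference: pick the earlier token the pronoun
    attends to most, averaged over heads.

    attn: (num_heads, seq_len, seq_len) self-attention weights of one layer.
    """
    scores = attn.mean(axis=0)[pronoun_idx]      # average heads, take pronoun row
    scores = scores[:pronoun_idx]                # only look at preceding tokens
    return int(np.argmax(scores))

tokens = ["the", "user", "asked", "if", "it", "works"]
rng = np.random.default_rng(0)
attn = rng.random((8, len(tokens), len(tokens)))
attn /= attn.sum(axis=-1, keepdims=True)         # normalize rows like softmax output
antecedent = resolve_from_attention(attn, pronoun_idx=4)  # resolve "it"
print(tokens[antecedent])
```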
no code implementations • 23 Feb 2022 • Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao
To address this, we utilize the ambiguous information among the age labels, convert each age label into a discrete label distribution and leverage the label distribution learning (LDL) method to fit the data.
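A small sketch of the conversion step follows: each scalar age label becomes a discrete Gaussian distribution over neighboring ages, which a model can then be fit to with a KL objective; the age range and bandwidth below are illustrative assumptions.

```python
import numpy as np

AGES = np.arange(10, 91)  # assumed discrete age support

def age_to_distribution(true_age, sigma=2.0):
    """Convert a scalar age label into a discrete Gaussian label distribution
    over neighboring ages (the core idea of label distribution learning)."""
    logits = -0.5 * ((AGES - true_age) / sigma) ** 2
    p = np.exp(logits)
    return p / p.sum()

def ldl_loss(pred_probs, true_age, sigma=2.0, eps=1e-12):
    """KL divergence between the target label distribution and the prediction."""
    target = age_to_distribution(true_age, sigma)
    return float(np.sum(target * (np.log(target + eps) - np.log(pred_probs + eps))))

target = age_to_distribution(35)
print(AGES[target.argmax()], target.max())  # peak at the true age, mass on neighbors
```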
no code implementations • 22 Feb 2022 • Tong Ye, Shijing Si, Jianzong Wang, Rui Wang, Ning Cheng, Jing Xiao
The visual dialog task attempts to train an agent to answer multi-turn questions given an image, which requires the deep understanding of interactions between the image and dialog history.
no code implementations • 21 Feb 2022 • Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao
In this paper, we aim to evaluate and enhance the robustness of G2P models.
Automatic Speech Recognition (ASR) • +2
no code implementations • 29 Sep 2021 • Tang huaizhen, xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Voice conversion (VC) aims to convert one speaker's voice to generate new speech that sounds as if it were said by another speaker.
no code implementations • 10 Jul 2021 • Shijing Si, Jianzong Wang, Xiaoyang Qu, Ning Cheng, Wenqi Wei, Xinghua Zhu, Jing Xiao
This paper investigates a novel task of talking face video generation solely from speeches.
no code implementations • 9 Jul 2021 • Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
End-to-end speech recognition systems usually require huge amounts of labeling resources, while annotating the speech data is complicated and expensive.
no code implementations • 9 Jul 2021 • Zhenhou Hong, Jianzong Wang, Xiaoyang Qu, Jie Liu, Chendong Zhao, Jing Xiao
Text to speech (TTS) is a crucial task for user interaction, but TTS model training relies on a sizable set of high-quality original datasets.
no code implementations • 9 Jul 2021 • Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
We evaluated the proposed methods on phoneme classification and speaker recognition tasks.
no code implementations • 26 Feb 2021 • Jie Zhao, Xinghua Zhu, Jianzong Wang, Jing Xiao
In this paper an efficient method is proposed to evaluate the contributions of federated participants.
no code implementations • 24 Feb 2021 • Yong liu, Xinghua Zhu, Jianzong Wang, Jing Xiao
In addition, using the proposed metric, we investigate the influential factors of risk level.
no code implementations • 23 Feb 2021 • Xiaoyang Qu, Jianzong Wang, Jing Xiao
We add an activation regularizer and a virtual interpolation method to improve the data generation efficiency.
no code implementations • 23 Feb 2021 • Jian Luo, Jianzong Wang, Ning Cheng, Jing Xiao
We propose a novel network structure, called Memory-Self-Attention (MSA) Transducer.
no code implementations • 22 Feb 2021 • Yanfei Hui, Jianzong Wang, Ning Cheng, Fengying Yu, Tianbo Wu, Jing Xiao
Slot filling and intent detection have become a significant theme in the field of natural language understanding.
no code implementations • 3 Dec 2020 • Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Lingwei Kong, Jing Xiao
A graph-to-sequence model is proposed, formed by a graph encoder and an attentional decoder.
3 code implementations • 3 Dec 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
In this paper, an efficient network, named location-variable convolution, is proposed to model the dependencies of waveforms.
no code implementations • Findings of the Association for Computational Linguistics 2020 • Xinghua Zhu, Jianzong Wang, Zhenhou Hong, Jing Xiao
It is also found that FL models are sensitive to the balance of data load among client datasets.
no code implementations • 16 Sep 2020 • Anxun He, Jianzong Wang, Zhangcheng Huang, Jing Xiao
Federated learning has made an important contribution to preserving data privacy.
no code implementations • 18 Aug 2020 • Wenqi Wei, Jianzong Wang, Jiteng Ma, Ning Cheng, Jing Xiao
The structure of our model is kept concise so that it can be implemented for real-time applications.
no code implementations • 13 Aug 2020 • Xiaoyang Qu, Jianzong Wang, Jing Xiao
We borrow the idea of neural architecture search (NAS) for the text-independent speaker verification task.
Neural Architecture Search • Text-Independent Speaker Verification
no code implementations • 13 Aug 2020 • Xueli Jia, Jianzong Wang, Zhiyong Zhang, Ning Cheng, Jing Xiao
However, the increased complexity of a model can also introduce a high risk of over-fitting, which is a major challenge in SLU tasks due to the limitation of available data.
Automatic Speech Recognition (ASR) • +3
no code implementations • 13 Aug 2020 • Zhenpeng Zheng, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao
MLNET leverages multiple branches to extract multiple kinds of contextual speech information and investigates an effective attention block to weight the most crucial parts of the context for the final classification.
no code implementations • 13 Aug 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Jing Xiao
Recent neural speech synthesis systems have gradually focused on the control of prosody to improve the quality of synthesized speech, but they rarely consider the variability of prosody and the correlation between prosody and semantics together.
5 code implementations • 27 Jul 2020 • Chaoyang He, Songze Li, Jinhyun So, Xiao Zeng, Mi Zhang, Hongyi Wang, Xiaoyang Wang, Praneeth Vepakomma, Abhishek Singh, Hang Qiu, Xinghua Zhu, Jianzong Wang, Li Shen, Peilin Zhao, Yan Kang, Yang Liu, Ramesh Raskar, Qiang Yang, Murali Annavaram, Salman Avestimehr
Federated learning (FL) is a rapidly growing research field in machine learning.
no code implementations • 20 May 2020 • Linhao Dong, Cheng Yi, Jianzong Wang, Shiyu Zhou, Shuang Xu, Xueli Jia, Bo Xu
End-to-end models are gaining wider attention in the field of automatic speech recognition (ASR).
Automatic Speech Recognition (ASR) • +1
no code implementations • 9 Apr 2020 • xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao
Most singer identification methods are processed in the frequency domain, which potentially leads to information loss during the spectral transformation.
2 code implementations • 4 Mar 2020 • Zhen Zeng, Jianzong Wang, Ning Cheng, Tian Xia, Jing Xiao
Targeting at both high efficiency and performance, we propose AlignTTS to predict the mel-spectrum in parallel.
no code implementations • 4 Mar 2020 • Aolan Sun, Jianzong Wang, Ning Cheng, Huayi Peng, Zhen Zeng, Jing Xiao
This paper leverages the graph-to-sequence method in neural text-to-speech (GraphTTS), which maps the graph embedding of the input sequence to spectrograms.
no code implementations • 4 Mar 2020 • Chen Feng, Jianzong Wang, Tongxu Li, Junqing Peng, Jing Xiao
Recently, the speaker clustering model based on agglomerative hierarchical clustering (AHC) has become a common method for solving two main problems: clustering without a preset number of categories and clustering with a fixed number of categories.
1 code implementation • 22 Mar 2019 • Sein Minn, Michel C. Desmarais, Feida Zhu, Jing Xiao, Jianzong Wang
Knowledge Tracing (KT) is the assessment of a student's knowledge state and the prediction of whether that student will answer the next problem correctly, based on a number of previous practices and outcomes in their learning process.