1 code implementation • 21 Nov 2023 • Linfeng Feng, Xiao-Lei Zhang, Xuelong Li
To address this, we propose an Unbiased Label Distribution (ULD) to eliminate quantization error in training targets.
no code implementations • 22 Oct 2023 • Yibo Bai, Xiao-Lei Zhang
Recently, deep-learning-based automatic speaker verification (ASV) has been shown to be easily contaminated by adversarial attacks, a new type of attack that injects imperceptible perturbations into audio signals so as to make ASV produce wrong decisions.
no code implementations • 15 Apr 2023 • Linfeng Feng, Yijun Gong, Xiao-Lei Zhang
The core idea is to incorporate the geometric connections between classes into the label coding process. The first method, named static soft label coding (SSLC), modifies the one-hot codes into soft codes based on the distances between the local areas.
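A minimal sketch of the SSLC idea, under assumptions not stated in the excerpt: each class corresponds to a spatial position, and the soft label is a softmax over negative distances to the true class, with a hypothetical `temperature` parameter controlling the decay.

```python
import numpy as np

def static_soft_label_coding(class_positions, true_class, temperature=1.0):
    """Sketch of distance-based soft label coding (SSLC-style).

    class_positions: (C, D) array of assumed spatial locations of the classes.
    true_class: index of the ground-truth class.
    temperature: assumed hyperparameter; smaller values approach one-hot codes.
    """
    # Distance from every class position to the true class position.
    dists = np.linalg.norm(class_positions - class_positions[true_class], axis=1)
    # Geometrically closer classes receive more label mass.
    logits = -dists / temperature
    soft = np.exp(logits - logits.max())
    return soft / soft.sum()

# Example: 4 local areas on a line; the soft label peaks at the true class
# and decays toward geometrically distant classes.
positions = np.array([[0.0], [1.0], [2.0], [3.0]])
label = static_soft_label_coding(positions, true_class=1)
```

In contrast to one-hot codes, a mistake into a neighboring area is penalized less than a mistake into a distant one.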
no code implementations • 27 Feb 2023 • Jun Qi, Xiao-Lei Zhang, Javier Tejedor
In this work, we propose an efficient optimization algorithm, namely federated quantum natural gradient descent (FQNGD), and further apply it to a QFL framework composed of variational quantum circuit (VQC)-based quantum neural networks (QNNs).
no code implementations • 21 Feb 2023 • Jiadi Yao, Hong Luo, Xiao-Lei Zhang
Unlike existing approaches that operate on voices in the time domain, the proposed framework operates in the time-frequency domain, which improves the interpretability, transferability, and imperceptibility of the attack.
no code implementations • 2 Nov 2022 • Xing Chen, Jie Wang, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang
It utilizes score variation as an indicator to detect adversarial examples, where the score variation is the absolute discrepancy between the ASV scores of an original audio recording and its transformed audio synthesized from its masked complex spectrogram.
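The detection rule described above reduces to a simple threshold test. The sketch below assumes a generic `asv_score` callable and a `transform` callable (standing in for the masked-spectrogram resynthesis); both interfaces are illustrative, not the paper's actual API.

```python
import numpy as np

def score_variation_detector(asv_score, transform, audio, threshold):
    """Sketch of the score-variation indicator for adversarial detection.

    asv_score: callable mapping an audio array to an ASV similarity score.
    transform: callable that (in the paper) masks the complex spectrogram
               and resynthesizes the audio; any perturbation-sensitive
               transform works for this sketch.
    Flags the input as adversarial when the absolute score discrepancy
    between the original and transformed audio exceeds the threshold.
    """
    variation = abs(asv_score(audio) - asv_score(transform(audio)))
    return variation > threshold
```

The intuition is that adversarial perturbations are fragile: the transform disrupts them, so the ASV score of an adversarial input shifts far more than that of a genuine one.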
1 code implementation • 30 Oct 2022 • Jie Wang, Menglong Xu, Jingyong Hou, BinBin Zhang, Xiao-Lei Zhang, Lei Xie, Fuping Pan
Keyword spotting (KWS) enables speech-based user interaction and gradually becomes an indispensable component of smart devices.
no code implementations • 30 Oct 2022 • Jiadi Yao, Xing Chen, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang
To our knowledge, existing adversarial attack approaches to speaker identification either incur high computational cost or are not very effective.
no code implementations • 19 Oct 2022 • Shupei Liu, Linfeng Feng, Yijun Gong, Chengdong Liang, Chen Zhang, Xiao-Lei Zhang, Xuelong Li
To further boost the estimation accuracy, we introduce a node selection algorithm that strategically filters the most reliable nodes.
no code implementations • 16 Oct 2022 • Yijun Gong, Shupei Liu, Xiao-Lei Zhang
Accordingly, the sound source localization problem is formulated as a classification task of recognizing the one-hot code of the speaker given the one-hot codes of the microphone nodes and their speech recordings.
1 code implementation • 25 Jul 2022 • Jie Wang, Xiao-Lei Zhang
In this paper, we propose a novel approach to improve the accuracy of the pseudo labels in the target domain.
no code implementations • 13 Jul 2021 • Shengqiang Li, Menglong Xu, Xiao-Lei Zhang
To make use of the time order of the input sequence, many works inject some information about the relative or absolute position of the element into the input sequence.
no code implementations • 13 Jul 2021 • Menglong Xu, Shengqiang Li, Chengdong Liang, Xiao-Lei Zhang
Deep neural networks provide effective solutions to small-footprint keyword spotting (KWS).
no code implementations • 5 Jul 2021 • Xiao-Lei Zhang
Empirically, compared with a number of advanced deep clustering methods and as many as 20 representative unsupervised ensemble learning and selection methods, the proposed methods reach state-of-the-art performance without manual hyperparameter tuning.
no code implementations • 29 Mar 2021 • Junqi Chen, Xiao-Lei Zhang
Because Sparsemax harshly pushes the weights of many channels to zero, we propose Scaling Sparsemax, which punishes channels more mildly by setting only the weights of very noisy channels to zero.
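For reference, this is the standard sparsemax transform (Martins and Astudillo, 2016) that the excerpt contrasts against: a Euclidean projection onto the probability simplex that, unlike softmax, can assign exact zeros to many channels. The paper's Scaling Sparsemax softens this behavior; its exact formulation is not given in the excerpt, so only plain sparsemax is sketched here.

```python
import numpy as np

def sparsemax(z):
    """Sparsemax: Euclidean projection of a logit vector onto the simplex.

    Unlike softmax, it returns exact zeros for sufficiently small logits,
    which is the harsh channel-pruning behavior discussed above.
    """
    z = np.asarray(z, dtype=float)
    z_sorted = np.sort(z)[::-1]           # logits in decreasing order
    k = np.arange(1, len(z) + 1)
    cssv = np.cumsum(z_sorted)
    support = k * z_sorted > cssv - 1     # indices kept in the support
    k_max = k[support][-1]
    tau = (cssv[k_max - 1] - 1) / k_max   # threshold shared by the support
    return np.maximum(z - tau, 0.0)
```

For a dominant channel, e.g. logits `[3.0, 1.0, 0.2]`, sparsemax concentrates all mass on one channel and zeroes the rest, illustrating why a milder variant is useful for noisy multi-channel weighting.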
no code implementations • 29 Mar 2021 • Chengdong Liang, Menglong Xu, Xiao-Lei Zhang
Although the performance of the proposed resGSA-Transformer is only slightly better than that of the RPSA-Transformer, it does not have to tune the window length manually.
1 code implementation • 28 Mar 2021 • Shanzheng Guan, Shupei Liu, Junqi Chen, Wenbo Zhu, Shengqiang Li, Xu Tan, Ziye Yang, Menglong Xu, Yijiang Chen, Jianyu Wang, Xiao-Lei Zhang
We trained several multi-device speech recognition systems on both the Libri-adhoc40 dataset and a simulated dataset.
no code implementations • 24 Feb 2021 • Jianyu Wang, Xiao-Lei Zhang
In this paper, we propose a deep NMF (DNMF) topic modeling framework to alleviate the aforementioned problems.
no code implementations • 2 Dec 2020 • Zhongxin Bai, Xiao-Lei Zhang
Speaker recognition is the task of identifying persons from their voices.
no code implementations • 1 Dec 2020 • Ziye Yang, Shanzheng Guan, Xiao-Lei Zhang
In this paper, we propose deep ad-hoc beamforming based on speaker extraction, which is to our knowledge the first work for target-dependent speech separation based on ad-hoc microphone arrays and deep learning.
no code implementations • 29 Nov 2020 • Wenbo Zhu, Mou Wang, Xiao-Lei Zhang, Susanto Rahardja
Among them, learnable features, which are trained jointly with separation networks in an end-to-end fashion, have become a new trend of modern speech separation research, e.g., the convolutional time-domain audio separation network (Conv-TasNet), while handcrafted and parameterized features have also been shown to be competitive in very recent studies.
1 code implementation • 23 Oct 2020 • Menglong Xu, Shengqiang Li, Xiao-Lei Zhang
To reduce the computational complexity and improve the performance, we further propose local DSA (LDSA) to restrict the attention scope of DSA to a local range around the current central frame for speech recognition.
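A toy sketch of the local restriction, under assumed shapes: in dense synthesizer attention, the weights over a frame's window are predicted from that frame alone (no query-key dot products); LDSA limits the window to `context` frames on each side. The projection `W` and the edge handling here are illustrative simplifications, not the paper's implementation.

```python
import numpy as np

def local_dense_synthesizer_attention(H, W, context):
    """Sketch of local dense synthesizer attention (LDSA-style).

    H: (T, D) frame features.
    W: (D, 2*context+1) assumed projection predicting, from each frame
       alone, attention logits over its local window of offsets
       -context..+context.
    """
    T, _ = H.shape
    logits = H @ W                       # per-frame logits over the window
    out = np.zeros_like(H)
    for t in range(T):
        # Keep only offsets that stay inside the sequence, then softmax.
        offs = [o for o in range(-context, context + 1) if 0 <= t + o < T]
        l = np.array([logits[t, o + context] for o in offs])
        p = np.exp(l - l.max())
        p /= p.sum()
        for w_j, o in zip(p, offs):
            out[t] += w_j * H[t + o]     # convex combination of local frames
    return out
```

Restricting attention to a fixed window makes the cost linear in sequence length, which is the efficiency gain the excerpt refers to.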
no code implementations • 23 Oct 2020 • Xu Tan, Xiao-Lei Zhang
Recent studies show that speech enhancement is helpful to VAD, but the performance improvement is limited.
no code implementations • 19 Nov 2019 • Zhongxin Bai, Xiao-Lei Zhang, Jingdong Chen
We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance.
no code implementations • 24 Oct 2019 • Jianyu Wang, Xiao-Lei Zhang
Specifically, we first apply multilayer bootstrap network (MBN), which is an unsupervised deep model, to reduce the dimension of documents, and then use the low-dimensional data representations or their clustering results as the target of supervised Lasso for topic word discovery.
no code implementations • 24 Oct 2019 • Ziye Yang, Xiao-Lei Zhang
To deal with this problem, we propose a variant of DPCL, named DPCL++, which applies a recent unsupervised deep learning method, multilayer bootstrap networks (MBN), to further reduce the noise and small variations of the embedding vectors in an unsupervised way at the test stage, which facilitates k-means in producing a good result.
1 code implementation • 3 Nov 2018 • Xiao-Lei Zhang
Its core idea is to reweight the estimated speech signals with a sparsity constraint when conducting adaptive beamforming. The weights, produced by a neural network, estimate the propagation cost from the speech source to the ad-hoc microphone array, e.g., signal-to-noise ratios, and the sparsity constraint filters out the microphones that are too far away from both the speech source and the majority of the ad-hoc microphone array.
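The reweight-and-prune step can be sketched as follows, with the network-predicted weights replaced by a plain array and the sparsity constraint approximated by a top-k / floor cutoff; both are assumptions standing in for the paper's learned mechanism.

```python
import numpy as np

def sparse_reweight(channel_signals, weights, keep_top=None, floor=0.0):
    """Sketch of sparsity-constrained channel reweighting for ad-hoc arrays.

    channel_signals: (M, T) estimated speech from M ad-hoc microphones.
    weights: (M,) assumed stand-in for network-predicted propagation costs
             (e.g. per-channel SNR estimates).
    Channels outside the top-k or below `floor` are zeroed, mimicking the
    sparsity constraint that discards microphones far from both the source
    and the array majority; surviving weights are renormalized.
    """
    w = np.asarray(weights, dtype=float).copy()
    if keep_top is not None:
        cutoff = np.sort(w)[::-1][keep_top - 1]
        w[w < cutoff] = 0.0
    w[w < floor] = 0.0
    s = w.sum()
    if s > 0:
        w /= s
    # Weighted sum across channels yields the beamformed estimate.
    return (w[:, None] * channel_signals).sum(axis=0)
```

Zeroing distant channels before the weighted sum keeps the beamformer from averaging in microphones that contribute mostly noise.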
no code implementations • 1 Aug 2017 • Xiao-Lei Zhang
In this abstract paper, we introduce a new kernel learning method by a nonparametric density estimator.
no code implementations • 21 Sep 2015 • Xiao-Lei Zhang
We apply multilayer bootstrap network (MBN), a recently proposed unsupervised learning method, to unsupervised speaker recognition.
no code implementations • 18 Aug 2015 • Xiao-Lei Zhang, Yong Han, DongSheng Hao, Zhihan Lv
This is the preprint version of our paper at ICONIP.
no code implementations • 22 Mar 2015 • Xiao-Lei Zhang
Our result suggests that the new technique combines the effectiveness of MBN on unsupervised learning with the effectiveness and efficiency of DNNs on supervised learning, making compressive MBN both effective and efficient for unsupervised learning.
no code implementations • 3 Dec 2014 • Xiao-Lei Zhang
In this paper, we further extend it to supervised learning incrementally.
no code implementations • 5 Aug 2014 • Xiao-Lei Zhang
Geometrically, the nonparametric density estimator at each layer projects the input data space to a uniformly-distributed discrete feature space, where the similarity of two data points in the discrete feature space is measured by the number of nearest centroids they share.
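A small sketch of that discrete feature space, under the assumption (common in MBN-style models) that each layer holds several centroid sets sampled from the data: every set maps a point to the one-hot index of its nearest centroid, and the inner product of two concatenated codes counts the nearest centroids the points share.

```python
import numpy as np

def shared_centroid_code(X, centroid_sets):
    """Sketch of a one-layer discrete feature built from nearest centroids.

    X: (N, D) data points.
    centroid_sets: list of (k, D) arrays, each an assumed random sample of
                   the data acting as centroids.
    Returns the concatenated one-hot codes; dot products between rows count
    shared nearest centroids, i.e. the similarity described in the text.
    """
    codes = []
    for C in centroid_sets:
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - C[None, :, :]) ** 2).sum(-1)
        idx = np.argmin(d2, axis=1)          # nearest centroid per point
        codes.append(np.eye(C.shape[0])[idx])  # one-hot encode the index
    return np.concatenate(codes, axis=1)
```

Nearby points tend to pick the same nearest centroid in most sets, so their codes overlap heavily, while distant points share few or no centroids.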
no code implementations • 16 Dec 2013 • Xiao-Lei Zhang
In this paper, we propose an extremely simple deep model for unsupervised nonlinear dimensionality reduction, named deep distributed random samplings, which performs like a stack of unsupervised bootstrap aggregating.
no code implementations • 22 Aug 2013 • Xiao-Lei Zhang
Restricted Boltzmann machines, sparse coding, regularized auto-encoders, and convolutional neural networks are pioneering building blocks of deep learning.
no code implementations • 5 May 2013 • Xiao-Lei Zhang, Ji Wu
(ii) Based on the above two views, we propose a very simple deep learning algorithm, named deep random model ensemble (DRME).
no code implementations • 8 Mar 2013 • Xiao-Lei Zhang
One important classifier ensemble for multiclass classification problems is Error-Correcting Output Codes (ECOCs).
no code implementations • 8 Mar 2013 • Xiao-Lei Zhang
Then, we propose two convex DMTC objectives within the framework.