Search Results for author: Xiao-Lei Zhang

Found 38 papers, 6 papers with code

Eliminating Quantization Errors in Classification-Based Sound Source Localization

1 code implementation · 21 Nov 2023 · Linfeng Feng, Xiao-Lei Zhang, Xuelong Li

To address this, we propose an Unbiased Label Distribution (ULD) to eliminate quantization error in training targets.

Classification, Quantization

Diffusion-Based Adversarial Purification for Speaker Verification

no code implementations · 22 Oct 2023 · Yibo Bai, Xiao-Lei Zhang

Deep-learning-based automatic speaker verification (ASV) is vulnerable to adversarial attacks, a new type of attack that injects imperceptible perturbations into audio signals to make ASV produce wrong decisions.

Denoising, Speaker Verification

Soft Label Coding for End-to-end Sound Source Localization With Ad-hoc Microphone Arrays

no code implementations · 15 Apr 2023 · Linfeng Feng, Yijun Gong, Xiao-Lei Zhang

The core idea is to incorporate the geometric connection between the classes into the label coding process. The first method, static soft label coding (SSLC), turns the one-hot codes into soft codes based on the distances between the local areas.

Quantization

Optimizing Quantum Federated Learning Based on Federated Quantum Natural Gradient Descent

no code implementations · 27 Feb 2023 · Jun Qi, Xiao-Lei Zhang, Javier Tejedor

In this work, we propose an efficient optimization algorithm, federated quantum natural gradient descent (FQNGD), and apply it to a QFL framework composed of variational quantum circuit (VQC)-based quantum neural networks (QNN).

Federated Learning

Interpretable Spectrum Transformation Attacks to Speaker Recognition

no code implementations · 21 Feb 2023 · Jiadi Yao, Hong Luo, Xiao-Lei Zhang

Unlike existing approaches that operate on voices in the time domain, the proposed framework operates on voices in the time-frequency domain, which improves the interpretability, transferability, and imperceptibility of the attack.

Speaker Recognition

LMD: A Learnable Mask Network to Detect Adversarial Examples for Speaker Verification

no code implementations · 2 Nov 2022 · Xing Chen, Jie Wang, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang

It utilizes score variation as an indicator to detect adversarial examples, where the score variation is the absolute discrepancy between the ASV scores of an original audio recording and its transformed audio synthesized from its masked complex spectrogram.

Speaker Verification
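
The score-variation indicator described in this excerpt can be sketched roughly as follows. This is only an illustration of the stated definition (the absolute difference between two ASV scores), not the paper's implementation: `score_variation`, `toy_score`, `toy_mask`, and the threshold value are hypothetical names and values, and the time-domain mask is a toy stand-in for the learned mask on the complex spectrogram.

```python
from typing import Callable, List

Audio = List[float]


def score_variation(
    asv_score: Callable[[Audio, Audio], float],
    transform: Callable[[Audio], Audio],
    enrollment: Audio,
    test: Audio,
) -> float:
    """Absolute difference between the ASV score of the original test audio
    and the score of its transformed version."""
    original = asv_score(enrollment, test)
    transformed = asv_score(enrollment, transform(test))
    return abs(original - transformed)


# Toy stand-ins (assumptions, not part of the listed paper).
def toy_score(enrollment: Audio, test: Audio) -> float:
    # Dot product used as a placeholder similarity score.
    return sum(e * t for e, t in zip(enrollment, test))


def toy_mask(audio: Audio) -> Audio:
    # Zero out every second sample; a real system would mask the spectrogram.
    return [x if i % 2 == 0 else 0.0 for i, x in enumerate(audio)]


variation = score_variation(toy_score, toy_mask, [1.0] * 4, [0.5] * 4)
flagged = variation > 0.5  # hypothetical decision threshold
```

Thresholding the variation is an assumed decision rule; the excerpt only says the variation serves as an indicator.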

WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

1 code implementation · 30 Oct 2022 · Jie Wang, Menglong Xu, Jingyong Hou, BinBin Zhang, Xiao-Lei Zhang, Lei Xie, Fuping Pan

Keyword spotting (KWS) enables speech-based user interaction and is gradually becoming an indispensable component of smart devices.

Keyword Spotting

Symmetric Saliency-based Adversarial Attack To Speaker Identification

no code implementations · 30 Oct 2022 · Jiadi Yao, Xing Chen, Xiao-Lei Zhang, Wei-Qiang Zhang, Kunde Yang

To our knowledge, adversarial attack approaches to speaker identification either incur a high computational cost or are not very effective.

Adversarial Attack, Speaker Identification

Deep Learning Based Stage-wise Two-dimensional Speaker Localization with Large Ad-hoc Microphone Arrays

no code implementations · 19 Oct 2022 · Shupei Liu, Linfeng Feng, Yijun Gong, Chengdong Liang, Chen Zhang, Xiao-Lei Zhang, Xuelong Li

To further boost the estimation accuracy, we introduce a node selection algorithm that strategically filters the most reliable nodes.

End-to-end Two-dimensional Sound Source Localization With Ad-hoc Microphone Arrays

no code implementations · 16 Oct 2022 · Yijun Gong, Shupei Liu, Xiao-Lei Zhang

Accordingly, the sound source localization problem is formulated as a classification task: recognizing the one-hot code of the speaker given the one-hot codes of the microphone nodes and their speech recordings.

Improving Pseudo Labels With Intra-Class Similarity for Unsupervised Domain Adaptation

1 code implementation · 25 Jul 2022 · Jie Wang, Xiao-Lei Zhang

In this paper, we propose a novel approach to improve the accuracy of the pseudo labels in the target domain.

Unsupervised Domain Adaptation

Conformer-based End-to-end Speech Recognition With Rotary Position Embedding

no code implementations · 13 Jul 2021 · Shengqiang Li, Menglong Xu, Xiao-Lei Zhang

To make use of the time order of the input sequence, many works inject some information about the relative or absolute position of the element into the input sequence.

Position, Speech Recognition

fMBN-E: Efficient Unsupervised Network Structure Ensemble and Selection for Clustering

no code implementations · 5 Jul 2021 · Xiao-Lei Zhang

Empirically, compared with a number of advanced deep clustering methods and as many as 20 representative unsupervised ensemble learning and selection methods, the proposed methods reach state-of-the-art performance without manual hyperparameter tuning.

Clustering, Deep Clustering

Scaling sparsemax based channel selection for speech recognition with ad-hoc microphone arrays

no code implementations · 29 Mar 2021 · Junqi Chen, Xiao-Lei Zhang

Because Sparsemax harshly drives the weights of many channels to zero, we propose Scaling Sparsemax, which penalizes the channels more mildly by setting only the weights of very noisy channels to zero.

Speech Recognition
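
To illustrate why plain sparsemax zeroes out many weights, here is a minimal pure-Python sketch of the standard sparsemax transform (the Euclidean projection of a score vector onto the probability simplex). It is not the paper's Scaling Sparsemax, whose formulation is not given in this excerpt, and the channel scores in the example are made up.

```python
from typing import List


def sparsemax(scores: List[float]) -> List[float]:
    """Standard sparsemax: project `scores` onto the probability simplex."""
    ordered = sorted(scores, reverse=True)
    running_sum = 0.0
    tau = 0.0
    for k, z_k in enumerate(ordered, start=1):
        running_sum += z_k
        # Entry k is in the support while 1 + k * z_k exceeds the top-k sum;
        # the last k that satisfies this fixes the threshold tau.
        if 1.0 + k * z_k > running_sum:
            tau = (running_sum - 1.0) / k
    # Scores at or below the threshold receive exactly zero weight.
    return [max(s - tau, 0.0) for s in scores]


# Hypothetical channel scores: only the strongest channel keeps any weight.
weights = sparsemax([2.0, 1.0, 0.1])
```

Unlike softmax, which would give every channel a nonzero weight, sparsemax here assigns all of the weight to the first channel.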

Transformer-based end-to-end speech recognition with residual Gaussian-based self-attention

no code implementations · 29 Mar 2021 · Chengdong Liang, Menglong Xu, Xiao-Lei Zhang

Although the proposed resGSA-Transformer performs only slightly better than the RPSA-Transformer, it does not require manual tuning of the window length.

Speech Recognition

Deep NMF Topic Modeling

no code implementations · 24 Feb 2021 · Jianyu Wang, Xiao-Lei Zhang

In this paper, we propose a deep NMF (DNMF) topic modeling framework to alleviate the aforementioned problems.

Deep Ad-hoc Beamforming Based on Speaker Extraction for Target-Dependent Speech Separation

no code implementations · 1 Dec 2020 · Ziye Yang, Shanzheng Guan, Xiao-Lei Zhang

In this paper, we propose deep ad-hoc beamforming based on speaker extraction, which is to our knowledge the first work for target-dependent speech separation based on ad-hoc microphone arrays and deep learning.

Speech Enhancement, Speech Separation

A comparison of handcrafted, parameterized, and learnable features for speech separation

no code implementations · 29 Nov 2020 · Wenbo Zhu, Mou Wang, Xiao-Lei Zhang, Susanto Rahardja

Among them, learnable features, which are trained jointly with separation networks in an end-to-end fashion, have become a new trend in modern speech separation research, e.g., the convolutional time-domain audio separation network (Conv-Tasnet), while handcrafted and parameterized features have also been shown to be competitive in very recent studies.

Sound

Transformer-based End-to-End Speech Recognition with Local Dense Synthesizer Attention

1 code implementation · 23 Oct 2020 · Menglong Xu, Shengqiang Li, Xiao-Lei Zhang

To reduce the computational complexity and improve the performance, we further propose local DSA (LDSA) to restrict the attention scope of DSA to a local range around the current central frame for speech recognition.

Speech Recognition

Speech enhancement aided end-to-end multi-task learning for voice activity detection

no code implementations · 23 Oct 2020 · Xu Tan, Xiao-Lei Zhang

Recent studies show that speech enhancement is helpful to VAD, but the performance improvement is limited.

Action Detection, Activity Detection

Partial AUC optimization based deep speaker embeddings with class-center learning for text-independent speaker verification

no code implementations · 19 Nov 2019 · Zhongxin Bai, Xiao-Lei Zhang, Jingdong Chen

We also propose a class-center based training trial construction method to improve the training efficiency, which is critical for the proposed loss function to be comparable to the identification loss in performance.

Text-Independent Speaker Verification

Deep topic modeling by multilayer bootstrap network and lasso

no code implementations · 24 Oct 2019 · Jianyu Wang, Xiao-Lei Zhang

Specifically, we first apply multilayer bootstrap network (MBN), which is an unsupervised deep model, to reduce the dimension of documents, and then use the low-dimensional data representations or their clustering results as the target of supervised Lasso for topic word discovery.

Clustering, Dimensionality Reduction

Multi-channel Speech Separation Using Deep Embedding Model with Multilayer Bootstrap Networks

no code implementations · 24 Oct 2019 · Ziye Yang, Xiao-Lei Zhang

To deal with the problem, we propose a variant of DPCL, named DPCL++, which applies a recent unsupervised deep learning method, multilayer bootstrap networks (MBN), to further reduce the noise and small variations of the embedding vectors in an unsupervised way in the test stage, which helps k-means produce a good result.

Clustering, Deep Clustering

Deep Ad-hoc Beamforming

1 code implementation · 3 Nov 2018 · Xiao-Lei Zhang

Its core idea is to reweight the estimated speech signals with a sparsity constraint when conducting adaptive beamforming, where the weights produced by a neural network are an estimate of the propagation cost from the speech source to the ad-hoc microphone array, e.g., signal-to-noise ratios, and the sparsity constraint filters out the microphones that are too far away from both the speech source and the majority of the ad-hoc microphone array.

Sound, Audio and Speech Processing

Learning the kernel matrix by resampling

no code implementations · 1 Aug 2017 · Xiao-Lei Zhang

In this abstract paper, we introduce a new kernel learning method by a nonparametric density estimator.

Clustering

Multilayer bootstrap network for unsupervised speaker recognition

no code implementations · 21 Sep 2015 · Xiao-Lei Zhang

We apply multilayer bootstrap network (MBN), a recently proposed unsupervised learning method, to unsupervised speaker recognition.

Clustering, Speaker Recognition

Unsupervised model compression for multilayer bootstrap networks

no code implementations · 22 Mar 2015 · Xiao-Lei Zhang

Our result suggests that the new technique combines the effectiveness of MBN in unsupervised learning with the effectiveness and efficiency of DNN in supervised learning, yielding a compressive MBN that is both effective and efficient for unsupervised learning.

Dimensionality Reduction, Model Compression

Multilayer bootstrap networks

no code implementations · 5 Aug 2014 · Xiao-Lei Zhang

Geometrically, the nonparametric density estimator at each layer projects the input data space to a uniformly-distributed discrete feature space, where the similarity of two data points in the discrete feature space is measured by the number of the nearest centroids they share in common.

Clustering, Dimensionality Reduction
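
The similarity measure mentioned in this excerpt, the number of nearest centroids two points share, can be sketched as below. The sketch assumes a single layer made of several independent centroid sets, which the excerpt does not spell out; all function names and the toy centroids are hypothetical.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def nearest_centroid(point: Point, centroids: Sequence[Point]) -> int:
    """Index of the centroid closest to `point` (squared Euclidean distance)."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((p - c) ** 2 for p, c in zip(point, centroids[i])),
    )


def discrete_code(point: Point, centroid_sets: Sequence[Sequence[Point]]) -> List[int]:
    """One nearest-centroid index per centroid set: the discrete feature of `point`."""
    return [nearest_centroid(point, centroids) for centroids in centroid_sets]


def shared_centroids(a: Point, b: Point, centroid_sets: Sequence[Sequence[Point]]) -> int:
    """Similarity of two points: how many centroid sets assign both to the same centroid."""
    code_a = discrete_code(a, centroid_sets)
    code_b = discrete_code(b, centroid_sets)
    return sum(x == y for x, y in zip(code_a, code_b))


# Toy centroid sets (assumed values, for illustration only).
centroid_sets = [
    [(0.0, 0.0), (10.0, 10.0)],
    [(0.0, 3.0), (10.0, 3.0)],
    [(5.0, 5.0), (20.0, 20.0)],
]
near = shared_centroids((1.0, 1.0), (2.0, 1.0), centroid_sets)
far = shared_centroids((1.0, 1.0), (9.0, 9.0), centroid_sets)
```

Nearby points share more centroids than distant ones, which is what makes the count usable as a similarity.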

Learning Deep Representations By Distributed Random Samplings

no code implementations · 16 Dec 2013 · Xiao-Lei Zhang

In this paper, we propose an extremely simple deep model for unsupervised nonlinear dimensionality reduction, called deep distributed random samplings, which performs like a stack of unsupervised bootstrap aggregating.

Clustering, Dimensionality Reduction

Learning Deep Representation Without Parameter Inference for Nonlinear Dimensionality Reduction

no code implementations · 22 Aug 2013 · Xiao-Lei Zhang

Restricted Boltzmann machines, sparse coding, regularized auto-encoders, and convolutional neural networks are pioneering building blocks of deep learning.

Clustering, Dimensionality Reduction

Simple Deep Random Model Ensemble

no code implementations · 5 May 2013 · Xiao-Lei Zhang, Ji Wu

(ii) Based on the above two views, we propose a very simple deep learning algorithm, named deep random model ensemble (DRME).

Clustering, Clustering Ensemble

Convex Discriminative Multitask Clustering

no code implementations · 8 Mar 2013 · Xiao-Lei Zhang

Then, we propose two convex DMTC objectives within the framework.

Clustering
