Search Results for author: Zexin Cai

Found 14 papers, 3 papers with code

A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification

no code implementations • 2 Apr 2018 • Weicheng Cai, Zexin Cai, Xiang Zhang, Xiaoqi Wang, Ming Li

A novel learnable dictionary encoding layer is proposed in this paper for end-to-end language identification.

Insights into End-to-End Learning Scheme for Language Identification

no code implementations • 2 Apr 2018 • Weicheng Cai, Zexin Cai, Wenbo Liu, Xiaoqi Wang, Ming Li

After comparing with the state-of-the-art GMM i-vector methods, we give insights into the CNN and reveal its role and effect in the whole pipeline.

Language Identification

Polyphone Disambiguation for Mandarin Chinese Using Conditional Neural Network with Multi-level Embedding Features

no code implementations • 3 Jul 2019 • Zexin Cai, Yaogen Yang, Chuxiong Zhang, Xiaoyi Qin, Ming Li

This paper describes a conditional neural network architecture for Mandarin Chinese polyphone disambiguation.

Polyphone disambiguation Sentence

Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario

no code implementations • 21 May 2020 • Zexin Cai, Yaogen Yang, Ming Li

In addition, we investigate the model's performance on cross-lingual synthesis, with and without a bilingual dataset during training.

Attribute Speech Synthesis

Training Wake Word Detection with Synthesized Speech Data on Confusion Words

no code implementations • 3 Nov 2020 • Yan Jia, Zexin Cai, Murong Ma, Zeqing Zhao, Xuyang Wang, Junjie Wang, Ming Li

Confusing words are commonly encountered in real-life keyword spotting applications, where complex spoken terms and words that sound similar to the predefined keywords cause severe performance degradation.

Data Augmentation Keyword Spotting +1

The DKU System Description for The Interspeech 2021 Auto-KWS Challenge

no code implementations • 11 Apr 2021 • Yechen Wang, Yan Jia, Murong Ma, Zexin Cai, Ming Li

This paper introduces the system submitted by the DKU-SMIIP team for the Auto-KWS 2021 Challenge.

Dynamic Time Warping Keyword Spotting +3

Invertible Voice Conversion

no code implementations • 26 Jan 2022 • Zexin Cai, Ming Li

In this paper, we propose an invertible deep learning framework called INVVC for voice conversion.

Voice Conversion

Identifying Source Speakers for Voice Conversion based Spoofing Attacks on Speaker Verification Systems

no code implementations • 18 Jun 2022 • Danwei Cai, Zexin Cai, Ming Li

An automatic speaker verification system aims to verify the speaker identity of a speech signal.

Speaker Identification Speaker Verification +1

Waveform Boundary Detection for Partially Spoofed Audio

no code implementations • 1 Nov 2022 • Zexin Cai, Weiqing Wang, Ming Li

The present paper proposes a waveform boundary detection system for audio spoofing attacks containing partially manipulated segments.

Boundary Detection

The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023

no code implementations • 20 Aug 2023 • Zexin Cai, Weiqing Wang, Yikang Wang, Ming Li

This paper introduces our system designed for Track 2, which focuses on locating manipulated regions, in the second Audio Deepfake Detection Challenge (ADD 2023).

Boundary Detection DeepFake Detection +2

Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning

no code implementations • 3 Jan 2024 • Danwei Cai, Zexin Cai, Ming Li

Specifically, a teacher model continually refines pseudo labels through online clustering, providing dynamic supervision signals to train the student model.

Clustering Knowledge Distillation +3

SIG-VC: A Speaker Information Guided Zero-shot Voice Conversion System for Both Human Beings and Machines

1 code implementation • 6 Nov 2021 • Haozhe Zhang, Zexin Cai, Xiaoyi Qin, Ming Li

Moreover, speaker information control is added to our system to maintain the voice cloning performance.

Disentanglement Speaker Verification +2

End-to-end Language Identification using NetFV and NetVLAD

1 code implementation • 9 Sep 2018 • Jinkun Chen, Weicheng Cai, Danwei Cai, Zexin Cai, Haibin Zhong, Ming Li

In this paper, we apply the NetFV and NetVLAD layers for the end-to-end language identification task.

Language Identification

From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint

1 code implementation • 10 May 2020 • Zexin Cai, Chuxiong Zhang, Ming Li

The constraint is imposed through an added loss related to the speaker identity, which is designed to improve the speaker similarity between the synthesized speech and its natural reference audio.

Speaker Verification Speech Synthesis +1
