Search Results for author: Zexin Cai

Found 14 papers, 3 papers with code

Self-supervised Reflective Learning through Self-distillation and Online Clustering for Speaker Representation Learning

no code implementations · 3 Jan 2024 · Danwei Cai, Zexin Cai, Ming Li

Specifically, a teacher model continually refines pseudo labels through online clustering, providing dynamic supervision signals to train the student model.

Clustering, Knowledge Distillation +3

The DKU-DUKEECE System for the Manipulation Region Location Task of ADD 2023

no code implementations · 20 Aug 2023 · Zexin Cai, Weiqing Wang, Yikang Wang, Ming Li

This paper introduces our system designed for Track 2, which focuses on locating manipulated regions, in the second Audio Deepfake Detection Challenge (ADD 2023).

Boundary Detection, DeepFake Detection +2

Waveform Boundary Detection for Partially Spoofed Audio

no code implementations · 1 Nov 2022 · Zexin Cai, Weiqing Wang, Ming Li

The present paper proposes a waveform boundary detection system for audio spoofing attacks containing partially manipulated segments.

Boundary Detection

Invertible Voice Conversion

no code implementations · 26 Jan 2022 · Zexin Cai, Ming Li

In this paper, we propose an invertible deep learning framework called INVVC for voice conversion.

Voice Conversion

Training Wake Word Detection with Synthesized Speech Data on Confusion Words

no code implementations · 3 Nov 2020 · Yan Jia, Zexin Cai, Murong Ma, Zeqing Zhao, Xuyang Wang, Junjie Wang, Ming Li

Confusing words are commonly encountered in real-life keyword spotting applications and cause severe performance degradation, owing to complex spoken terms and the many kinds of words that sound similar to the predefined keywords.

Data Augmentation, Keyword Spotting +1

Cross-lingual Multispeaker Text-to-Speech under Limited-Data Scenario

no code implementations · 21 May 2020 · Zexin Cai, Yaogen Yang, Ming Li

In addition, we investigate the model's performance on cross-lingual synthesis, with and without a bilingual dataset during training.

Attribute, Speech Synthesis

From Speaker Verification to Multispeaker Speech Synthesis, Deep Transfer with Feedback Constraint

1 code implementation · 10 May 2020 · Zexin Cai, Chuxiong Zhang, Ming Li

The constraint is taken by an added loss related to the speaker identity, which is centralized to improve the speaker similarity between the synthesized speech and its natural reference audio.

Speaker Verification, Speech Synthesis +1

End-to-end Language Identification using NetFV and NetVLAD

1 code implementation · 9 Sep 2018 · Jinkun Chen, Weicheng Cai, Danwei Cai, Zexin Cai, Haibin Zhong, Ming Li

In this paper, we apply the NetFV and NetVLAD layers for the end-to-end language identification task.

Language Identification

A Novel Learnable Dictionary Encoding Layer for End-to-End Language Identification

no code implementations · 2 Apr 2018 · Weicheng Cai, Zexin Cai, Xiang Zhang, Xiaoqi Wang, Ming Li

A novel learnable dictionary encoding layer is proposed in this paper for end-to-end language identification.

Language Identification

Insights into End-to-End Learning Scheme for Language Identification

no code implementations · 2 Apr 2018 · Weicheng Cai, Zexin Cai, Wenbo Liu, Xiaoqi Wang, Ming Li

After comparing with the state-of-the-art GMM i-vector methods, we give insights into CNN, and reveal its role and effect in the whole pipeline.

Language Identification
