Search Results for author: Jing-Xuan Zhang

Found 8 papers, 1 paper with code

Non-Parallel Sequence-to-Sequence Voice Conversion with Disentangled Linguistic and Speaker Representations

1 code implementation • 25 Jun 2019 • Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai

In this method, disentangled linguistic and speaker representations are extracted from acoustic features, and voice conversion is achieved by preserving the linguistic representations of source utterances while replacing the speaker representations with the target ones.
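The conversion step described above can be sketched minimally. The encoder and decoder callables below are hypothetical placeholders, not the authors' actual networks (the paper uses seq2seq recognition/speaker encoders and a seq2seq decoder); the sketch only illustrates the keep-linguistic/swap-speaker idea.

```python
import numpy as np

def convert(src_acoustics, tgt_acoustics, linguistic_enc, speaker_enc, decoder):
    """Keep the source utterance's linguistic content, swap in the
    target speaker's identity, then decode converted acoustics."""
    linguistic = linguistic_enc(src_acoustics)   # what is said (from source)
    speaker = speaker_enc(tgt_acoustics)         # who says it (from target)
    return decoder(linguistic, speaker)

# Toy stand-ins so the sketch runs end to end; real models are networks.
lin_enc = lambda x: x.mean(axis=0)               # fake linguistic embedding
spk_enc = lambda x: x.std(axis=0)                # fake speaker embedding
dec = lambda l, s: l + s                         # fake decoder

src = np.random.default_rng(0).random((100, 80))  # source mel-like frames
tgt = np.random.default_rng(1).random((120, 80))  # target-speaker frames
out = convert(src, tgt, lin_enc, spk_enc, dec)
```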

Audio and Speech Processing Sound

Deep Learning in Software Engineering

no code implementations • 13 May 2018 • Xiaochen Li, He Jiang, Zhilei Ren, Ge Li, Jing-Xuan Zhang

To answer these questions, we conduct a bibliography analysis on 98 research papers in SE that use deep learning techniques.

Software Engineering

Forward Attention in Sequence-to-sequence Acoustic Modelling for Speech Synthesis

no code implementations • 18 Jul 2018 • Jing-Xuan Zhang, Zhen-Hua Ling, Li-Rong Dai

This paper proposes a forward attention method for the sequence-to-sequence acoustic modeling of speech synthesis.
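The core of forward attention is a recursion that constrains alignments to move monotonically left-to-right over encoder states. A minimal NumPy sketch of that recursion, assuming the per-step content-based attention probabilities are already computed (the real model produces them inside a seq2seq decoder):

```python
import numpy as np

def forward_attention(energies):
    """Apply the forward-attention recursion to a (T, N) array of
    per-decoder-step attention probabilities over N encoder states:
    alpha_t(n) ∝ (alpha_{t-1}(n) + alpha_{t-1}(n-1)) * y_t(n)."""
    T, N = energies.shape
    alignments = np.zeros((T, N))
    alpha = np.zeros(N)
    alpha[0] = 1.0                     # start with all mass on state 0
    for t in range(T):
        shifted = np.concatenate(([0.0], alpha[:-1]))  # alpha_{t-1}(n-1)
        alpha = (alpha + shifted) * energies[t]
        alpha = alpha / alpha.sum()    # renormalize to a distribution
        alignments[t] = alpha
    return alignments
```

Because each step only carries mass forward from states n and n-1, the resulting alignment cannot jump backward, which is what stabilizes attention in seq2seq TTS.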

Acoustic Modelling Speech Synthesis

Voice Conversion by Cascading Automatic Speech Recognition and Text-to-Speech Synthesis with Prosody Transfer

no code implementations • 3 Sep 2020 • Jing-Xuan Zhang, Li-Juan Liu, Yan-Nian Chen, Ya-Jun Hu, Yuan Jiang, Zhen-Hua Ling, Li-Rong Dai

In this paper, we present an ASR-TTS method for voice conversion, which uses the iFLYTEK ASR engine to transcribe the source speech into text and a Transformer TTS model with a WaveNet vocoder to synthesize the converted speech from the decoded text.
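The cascade is a three-stage pipeline. The sketch below is schematic only: each stage is a placeholder callable standing in for the real ASR engine, TTS model, and vocoder, which are not reproduced here (and the paper's prosody transfer is omitted).

```python
def cascade_vc(source_wave, asr, tts, vocoder):
    """ASR-TTS voice conversion: transcribe the source speech, then
    re-synthesize the decoded text in the target voice."""
    text = asr(source_wave)      # source speech -> decoded text
    acoustics = tts(text)        # text -> target-voice acoustic features
    return vocoder(acoustics)    # acoustic features -> converted waveform

# Toy stand-ins so the pipeline runs; real stages are large models.
fake_asr = lambda w: "hello world"
fake_tts = lambda t: [len(tok) for tok in t.split()]
fake_vocoder = lambda feats: [f * 0.1 for f in feats]

wave = [0.0] * 16000
converted = cascade_vc(wave, fake_asr, fake_tts, fake_vocoder)
```

A design consequence of the cascade: any ASR transcription error propagates into the synthesized output, which is why the paper pairs it with prosody transfer from the source.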

Automatic Speech Recognition (ASR) +4

Is Lip Region-of-Interest Sufficient for Lipreading?

no code implementations • 28 May 2022 • Jing-Xuan Zhang, Gen-Shun Wan, Jia Pan

In this work, we propose to adopt the entire face for lipreading with self-supervised learning.

Lipreading Self-Supervised Learning +2

Self-Supervised Audio-Visual Speech Representations Learning By Multimodal Self-Distillation

no code implementations • 6 Dec 2022 • Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu

AV2vec has student and teacher modules; the student performs a masked latent-feature regression task using multimodal target features generated online by the teacher.
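The student-teacher scheme above can be sketched with toy linear "networks". This is an illustrative assumption, not the AV2vec architecture (the real modules are audio-visual transformers): the teacher tracks the student by exponential moving average, and the loss regresses the teacher's online targets at masked positions only.

```python
import numpy as np

rng = np.random.default_rng(0)
W_student = rng.random((8, 8))
W_teacher = W_student.copy()       # teacher initialized from the student

def ema_update(w_teacher, w_student, tau=0.999):
    """Teacher weights track the student via an exponential moving average."""
    return tau * w_teacher + (1.0 - tau) * w_student

def masked_regression_loss(features, mask, w_student, w_teacher):
    """Student sees masked input; teacher produces targets from the full
    input; loss is mean squared error at the masked positions."""
    student_out = (features * (1 - mask)[:, None]) @ w_student  # masked view
    targets = features @ w_teacher                              # online targets
    diff = (student_out - targets)[mask.astype(bool)]
    return float((diff ** 2).mean())

feats = rng.random((10, 8))
mask = np.zeros(10)
mask[[2, 5, 7]] = 1.0              # mask three frames for the student
loss = masked_regression_loss(feats, mask, W_student, W_teacher)
W_teacher = ema_update(W_teacher, W_student)
```

The teacher is never trained by gradients; only the EMA update moves it, which is what makes the targets "generated online" rather than fixed.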

Language Modelling
