Search Results for author: Jiachen Lian

Found 11 papers, 5 papers with code

Towards Hierarchical Spoken Language Dysfluency Modeling

no code implementations · 18 Jan 2024 · Jiachen Lian, Gopala Anumanchipalli

Speech disfluency modeling is the bottleneck for both speech therapy and language learning.

Deep Speech Synthesis from MRI-Based Articulatory Representations

1 code implementation · 5 Jul 2023 · Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W Black, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

Finally, through a series of ablations, we show that the proposed MRI representation is more comprehensive than EMA and identify the most suitable MRI feature subset for articulatory synthesis.

Tasks: Computational Efficiency, Denoising, +1

AV-data2vec: Self-supervised Learning of Audio-Visual Speech Representations with Contextualized Target Representations

no code implementations · 10 Feb 2023 · Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli

Self-supervision has shown great potential for audio-visual speech recognition by vastly reducing the amount of labeled data required to build good systems.

Tasks: Audio-Visual Speech Recognition, Self-Supervised Learning, +2

Articulatory Representation Learning Via Joint Factor Analysis and Neural Matrix Factorization

no code implementations · 29 Oct 2022 · Jiachen Lian, Alan W Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli

In this work, we propose a novel articulatory representation decomposition algorithm that takes advantage of guided factor analysis to derive articulator-specific factors and factor scores.

Tasks: Representation Learning
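The decomposition described in the abstract above can be sketched in a few lines; note this is a minimal illustrative stand-in (per-articulator SVD with an assumed channel grouping and factor count), not the paper's actual guided-factor-analysis algorithm.

```python
import numpy as np

# Hypothetical sketch: decompose each articulator's channel group
# separately, so the resulting factors are tied to one articulator.
# Grouping, factor count k, and the SVD-based factoring are all
# illustrative assumptions.

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 12))  # 200 frames, 12 EMA-like channels

# Assumed grouping: 4 articulators x 3 channels each.
groups = {"tongue_tip": slice(0, 3), "tongue_body": slice(3, 6),
          "lips": slice(6, 9), "jaw": slice(9, 12)}

factors, scores = {}, {}
for name, cols in groups.items():
    Xg = X[:, cols] - X[:, cols].mean(axis=0)   # center per group
    U, S, Vt = np.linalg.svd(Xg, full_matrices=False)
    k = 2                                       # keep 2 factors per articulator
    factors[name] = Vt[:k]                      # (k, channels): articulator-specific factors
    scores[name] = U[:, :k] * S[:k]             # (frames, k): time-varying factor scores

# Low-rank reconstruction for one articulator group
recon = scores["lips"] @ factors["lips"]        # (200, 3)
```

The point of the per-group split is that every factor is, by construction, specific to a single articulator rather than mixing channels across the whole vocal tract.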

UTTS: Unsupervised TTS with Conditional Disentangled Sequential Variational Auto-encoder

no code implementations · 6 Jun 2022 · Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu

We leverage recent advancements in self-supervised speech representation learning as well as speech synthesis front-end techniques for system development.

Tasks: Representation Learning, Speech Synthesis, +1

Towards Improved Zero-shot Voice Conversion with Conditional DSVAE

1 code implementation · 11 May 2022 · Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu

In our experiments on the VCTK dataset, we demonstrate that content embeddings derived from the conditional DSVAE overcome the randomness and achieve much better phoneme classification accuracy, stabilized vocalization, and better zero-shot VC performance compared with the competitive DSVAE baseline.

Tasks: Voice Conversion

Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition

1 code implementation · 1 Apr 2022 · Jiachen Lian, Alan W Black, Louis Goldstein, Gopala Krishna Anumanchipalli

Most of the research on data-driven speech representation learning has focused on raw audio in an end-to-end manner, paying little attention to its internal phonological or gestural structure.

Tasks: Representation Learning

Robust Disentangled Variational Speech Representation Learning for Zero-shot Voice Conversion

1 code implementation · 30 Mar 2022 · Jiachen Lian, Chunlei Zhang, Dong Yu

A zero-shot voice conversion is performed by feeding an arbitrary speaker embedding and content embeddings to the VAE decoder.

Tasks: Data Augmentation, Disentanglement, +2
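The conversion step described in the abstract above (feeding an arbitrary speaker embedding plus content embeddings to the decoder) can be sketched as follows; the linear decoder and all dimensions are illustrative assumptions standing in for the paper's VAE decoder.

```python
import numpy as np

# Minimal sketch of the zero-shot conversion step: a speaker
# embedding from any (possibly unseen) target speaker is tiled
# across time, concatenated with frame-level content embeddings,
# and decoded into spectrogram-like frames. The single linear
# layer is a hypothetical stand-in, not the paper's decoder.

rng = np.random.default_rng(0)
T, d_content, d_speaker, d_out = 100, 64, 16, 80

content = rng.normal(size=(T, d_content))   # from the source utterance
speaker = rng.normal(size=(d_speaker,))     # from an arbitrary target speaker

W = rng.normal(size=(d_content + d_speaker, d_out)) * 0.1

def decode(content, speaker, W):
    # tile the speaker embedding across time, concatenate per frame
    spk = np.broadcast_to(speaker, (content.shape[0], speaker.shape[0]))
    z = np.concatenate([content, spk], axis=1)
    return z @ W                             # (T, d_out) mel-like frames

mel = decode(content, speaker, W)
```

Because content and speaker information are disentangled, swapping in a different speaker embedding changes the decoded voice while the content embeddings (and hence the linguistic content) stay fixed.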
