no code implementations • 1 Mar 2024 • Weiwei Lin, Chenhang He, Man-Wai Mak, Jiachen Lian, Kong Aik Lee
This forces the model to learn a speaker distribution disentangled from the semantic content.
no code implementations • 18 Jan 2024 • Jiachen Lian, Gopala Anumanchipalli
Speech disfluency modeling is the bottleneck for both speech therapy and language learning.
no code implementations • 20 Dec 2023 • Jiachen Lian, Carly Feng, Naasir Farooqi, Steve Li, Anshul Kashyap, Cheol Jun Cho, Peter Wu, Robbie Netzorg, Tingle Li, Gopala Krishna Anumanchipalli
Dysfluent speech modeling requires time-accurate and silence-aware transcription at both the word and phonetic levels.
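To make this concrete, here is a minimal Python sketch of what a time-accurate, silence-aware two-tier transcription could look like; the `TimedUnit` structure, the `SIL` token, and the example timings are illustrative assumptions, not the paper's actual format.

```python
from dataclasses import dataclass

@dataclass
class TimedUnit:
    """One transcription unit with explicit timing; SIL marks silence."""
    label: str   # word or phoneme, or "SIL" for silence
    start: float # onset in seconds
    end: float   # offset in seconds

# Hypothetical dysfluent utterance "p- please" with an explicit pause:
# silence is kept as a first-class unit rather than collapsed away.
word_tier = [
    TimedUnit("p-", 0.00, 0.18),    # partial-word repetition
    TimedUnit("SIL", 0.18, 0.52),   # silence preserved with timing
    TimedUnit("please", 0.52, 0.95),
]
phone_tier = [
    TimedUnit("P", 0.00, 0.18),
    TimedUnit("SIL", 0.18, 0.52),
    TimedUnit("P", 0.52, 0.60),
    TimedUnit("L", 0.60, 0.70),
    TimedUnit("IY", 0.70, 0.85),
    TimedUnit("Z", 0.85, 0.95),
]
```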
1 code implementation • 5 Jul 2023 • Peter Wu, Tingle Li, Yijing Lu, Yubin Zhang, Jiachen Lian, Alan W Black, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli
Finally, through a series of ablations, we show that the proposed MRI representation is more comprehensive than EMA and identify the most suitable MRI feature subset for articulatory synthesis.
no code implementations • 10 Feb 2023 • Jiachen Lian, Alexei Baevski, Wei-Ning Hsu, Michael Auli
Self-supervision has shown great potential for audio-visual speech recognition by vastly reducing the amount of labeled data required to build good systems.
no code implementations • 29 Oct 2022 • Jiachen Lian, Alan W Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala K. Anumanchipalli
In this work, we propose a novel articulatory representation decomposition algorithm that takes advantage of guided factor analysis to derive articulatory-specific factors and factor scores.
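As a rough illustration of the guided idea only (not the paper's algorithm, which fits all factors jointly under a loading prior), the sketch below restricts each factor to the feature dimensions of one articulator via a user-supplied guide; the function name, the guide format, and the toy data are assumptions.

```python
import numpy as np

def guided_factors(X, guide):
    """Simplified guided-factor-analysis sketch.

    X:     (T, D) articulatory features (e.g., EMA trajectories).
    guide: dict mapping an articulator name to the column indices of X
           that its factor is allowed to load on (an assumed format).
    Returns, per articulator, a loading vector and (T,) factor scores.
    """
    X = X - X.mean(axis=0)
    factors = {}
    for name, dims in guide.items():
        sub = X[:, dims]                   # articulator-specific block
        # Leading principal direction of the block = that factor's loadings.
        _, _, vt = np.linalg.svd(sub, full_matrices=False)
        loadings = vt[0]
        scores = sub @ loadings            # per-frame factor scores
        factors[name] = (loadings, scores)
    return factors

# Toy usage: 100 frames of 6-dim "EMA"; dims 0-2 tongue, 3-5 lips (assumed).
X = np.random.randn(100, 6)
out = guided_factors(X, {"tongue": [0, 1, 2], "lips": [3, 4, 5]})
```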
no code implementations • 6 Jun 2022 • Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu
We leverage recent advancements in self-supervised speech representation learning as well as speech synthesis front-end techniques for system development.
1 code implementation • 11 May 2022 • Jiachen Lian, Chunlei Zhang, Gopala Krishna Anumanchipalli, Dong Yu
In our experiment on the VCTK dataset, we demonstrate that content embeddings derived from the conditional DSVAE overcome the randomness and achieve much better phoneme classification accuracy, stabilized vocalization, and better zero-shot VC performance compared with the competitive DSVAE baseline.
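One standard way to measure such a phoneme classification accuracy is a linear probe on frame-level content embeddings; the sketch below uses placeholder arrays, dimensions, and labels rather than actual DSVAE outputs.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Placeholder stand-ins for frame-level content embeddings from the
# (conditional) DSVAE encoder and their frame-level phoneme labels.
Z = np.random.randn(5000, 64)        # (frames, content dim), assumed shape
y = np.random.randint(0, 40, 5000)   # placeholder phoneme IDs

Z_tr, Z_te, y_tr, y_te = train_test_split(Z, y, test_size=0.2, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(Z_tr, y_tr)
print("phoneme probe accuracy:", probe.score(Z_te, y_te))
```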
1 code implementation • 1 Apr 2022 • Jiachen Lian, Alan W Black, Louis Goldstein, Gopala Krishna Anumanchipalli
Most research on data-driven speech representation learning has focused on raw audio in an end-to-end manner, paying little attention to its internal phonological or gestural structure.
1 code implementation • 30 Mar 2022 • Jiachen Lian, Chunlei Zhang, Dong Yu
Zero-shot voice conversion is performed by feeding an arbitrary speaker embedding and content embeddings to the VAE decoder.
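A minimal sketch of this decoding step, assuming toy dimensions and a stand-in GRU decoder rather than the paper's architecture: the speaker embedding is broadcast over time and concatenated with the content embeddings before decoding to mel frames.

```python
import torch
import torch.nn as nn

class VCDecoder(nn.Module):
    """Toy decoder mixing a content sequence with a broadcast speaker embedding."""
    def __init__(self, content_dim=64, spk_dim=128, mel_dim=80):
        super().__init__()
        self.net = nn.GRU(content_dim + spk_dim, 256, batch_first=True)
        self.out = nn.Linear(256, mel_dim)

    def forward(self, content, spk):
        # content: (B, T, content_dim); spk: (B, spk_dim)
        spk_seq = spk.unsqueeze(1).expand(-1, content.size(1), -1)
        h, _ = self.net(torch.cat([content, spk_seq], dim=-1))
        return self.out(h)  # (B, T, mel_dim) mel-spectrogram frames

decoder = VCDecoder()
content = torch.randn(1, 120, 64)  # content embeddings from source speech
spk = torch.randn(1, 128)          # embedding of an unseen target speaker
mel = decoder(content, spk)        # converted mel, ready for a vocoder
```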
1 code implementation • 9 Nov 2020 • Jiachen Lian, Aiswarya Vinod Kumar, Hira Dhamyal, Bhiksha Raj, Rita Singh
We further propose Multinomial Masked Proxy (MMP) loss to leverage the hardness of speaker pairs.
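The exact MMP formulation is in the paper; the code below is only a hedged sketch of the general direction, a proxy-based loss where each embedding's own proxy is excluded from the negative set and harder (more similar) negatives are weighted more heavily. All names, scales, and dimensions are assumptions.

```python
import torch
import torch.nn.functional as F

def masked_proxy_sketch(emb, labels, proxies, scale=30.0):
    """Hedged sketch of a masked-proxy-style loss (not the paper's exact MMP).

    emb:     (B, D) speaker embeddings.
    labels:  (B,) integer speaker IDs.
    proxies: (C, D) learnable class proxies.
    """
    emb = F.normalize(emb, dim=-1)
    proxies = F.normalize(proxies, dim=-1)
    sim = scale * emb @ proxies.t()            # (B, C) scaled cosine sims

    pos = sim.gather(1, labels.view(-1, 1))    # similarity to own proxy
    neg_mask = torch.ones_like(sim).scatter_(1, labels.view(-1, 1), 0.0)
    neg_sim = sim.masked_fill(neg_mask == 0, float("-inf"))

    # Hardness weighting: harder (more similar) negatives count more.
    w = torch.softmax(neg_sim.detach(), dim=1)
    neg = torch.logsumexp(neg_sim + torch.log(w + 1e-8), dim=1, keepdim=True)
    return F.softplus(neg - pos).mean()

# Toy usage with assumed batch size, embedding dim, and speaker count.
emb = torch.randn(8, 192)
labels = torch.randint(0, 100, (8,))
proxies = torch.randn(100, 192, requires_grad=True)
loss = masked_proxy_sketch(emb, labels, proxies)
```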