Search Results for author: JiLong Wu

Found 8 papers, 1 paper with code

Get Large Language Models Ready to Speak: A Late-fusion Approach for Speech Generation

no code implementations • 27 Oct 2024 • Maohao Shen, Shun Zhang, JiLong Wu, Zhiping Xiu, Ehab AlBadawy, Yiting Lu, Mike Seltzer, Qing He

Finally, we further explore MoLE-Llama in text-in-speech-out QA tasks, demonstrating its great potential as a multimodal dialog system capable of speech generation.

parameter-efficient fine-tuning · Question Answering +2

Self-Supervised Representations for Singing Voice Conversion

no code implementations • 21 Mar 2023 • Tejas Jayashankar, JiLong Wu, Leda Sari, David Kant, Vimal Manohar, Qing He

A singing voice conversion model converts a song in the voice of an arbitrary source singer to the voice of a target singer.

Disentanglement · Voice Conversion

Voice-preserving Zero-shot Multiple Accent Conversion

no code implementations • 23 Nov 2022 • Mumin Jin, Prashant Serai, JiLong Wu, Andros Tjandra, Vimal Manohar, Qing He

Most people who have tried to learn a foreign language have experienced difficulty understanding, or speaking with, a native speaker's accent.

VocBench: A Neural Vocoder Benchmark for Speech Synthesis

1 code implementation • 6 Dec 2021 • Ehab A. AlBadawy, Andrew Gibiansky, Qing He, JiLong Wu, Ming-Ching Chang, Siwei Lyu

We perform subjective and objective evaluations to compare the performance of each vocoder along different axes.

Speech Synthesis

Multi-rate attention architecture for fast streamable Text-to-speech spectrum modeling

no code implementations • 1 Apr 2021 • Qing He, Zhiping Xiu, Thilo Koehler, JiLong Wu

Typical high quality text-to-speech (TTS) systems today use a two-stage architecture, with a spectrum model stage that generates spectral frames and a vocoder stage that generates the actual audio.

Decoder · Text to Speech
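The two-stage architecture described in this abstract (a spectrum model that emits spectral frames, followed by a vocoder that renders audio) can be sketched as below. This is a minimal illustrative skeleton, not the paper's implementation: the function names, frame counts, and constants (`N_MELS`, `HOP_LENGTH`) are hypothetical placeholders chosen to show the data flow and shape contract between the two stages.

```python
# Illustrative sketch of a two-stage TTS pipeline.
# All names and sizes here are assumed for demonstration only.

N_MELS = 80        # spectral bins per frame (typical mel-spectrogram size)
HOP_LENGTH = 256   # audio samples rendered per spectral frame

def spectrum_model(text: str) -> list[list[float]]:
    """Stage 1: map text to a sequence of spectral frames.

    A real model predicts durations and mel values; this stub just
    allocates 10 frames per character as a dummy duration heuristic.
    """
    n_frames = 10 * len(text)
    return [[0.0] * N_MELS for _ in range(n_frames)]

def vocoder(frames: list[list[float]]) -> list[float]:
    """Stage 2: upsample spectral frames to a raw waveform.

    Each frame corresponds to HOP_LENGTH audio samples; a real neural
    vocoder would synthesize the waveform conditioned on the frames.
    """
    return [0.0] * (len(frames) * HOP_LENGTH)

# The streaming constraint in the abstract means stage 1 must produce
# frames fast enough for stage 2 to consume them in real time.
frames = spectrum_model("hello")
audio = vocoder(frames)
```

The key design point the entry highlights is that the two stages are decoupled: the spectrum model fixes the frame rate, and the vocoder's output length is a deterministic multiple of the frame count, which is what makes the pipeline streamable frame-by-frame.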
