Search Results for author: Xiaoxue Gao

Found 7 papers, 1 paper with code

Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

no code implementations • 24 Feb 2024 • Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

In this paper, we investigate a new way to pre-train a joint speech-text model to learn enhanced speech representations and benefit various speech-related downstream tasks.

Pseudo Label, Self-Supervised Learning

Self-Transcriber: Few-shot Lyrics Transcription with Self-training

no code implementations • 18 Nov 2022 • Xiaoxue Gao, Xianghu Yue, Haizhou Li

Current lyrics transcription approaches rely heavily on supervised learning with labeled data, but such data are scarce and manual labeling of singing is expensive.

Few-Shot Learning

token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text

no code implementations • 30 Oct 2022 • Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

Because the speech and text modalities have distinct characteristics, with speech being continuous and text discrete, we first discretize speech into a sequence of discrete speech tokens to solve the modality mismatch problem.

Intent Classification

PoLyScriber: Integrated Fine-tuning of Extractor and Lyrics Transcriber for Polyphonic Music

no code implementations • 15 Jul 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Lyrics transcription of polyphonic music is challenging as the background music affects lyrics intelligibility.

Music-robust Automatic Lyrics Transcription of Polyphonic Music

1 code implementation • 7 Apr 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

To improve the robustness of lyrics transcription to the background music, we propose a strategy of combining the features that emphasize the singing vocals, i.e., music-removed features that represent features of the extracted singing vocals, and the features that capture the singing vocals as well as the background music, i.e., music-present features.

Automatic Lyrics Transcription, Language Modelling

Genre-conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music

no code implementations • 7 Apr 2022 • Xiaoxue Gao, Chitralekha Gupta, Haizhou Li

Lyrics transcription of polyphonic music is challenging not only because the singing vocals are corrupted by the background music, but also because the background music and the singing style vary across music genres, such as pop, metal, and hip hop, which affects the lyrics intelligibility of a song in different ways.

Automatic Lyrics Transcription
