Search Results for author: Gene-Ping Yang

Found 6 papers, 3 papers with code

Towards Matching Phones and Speech Representations

no code implementations • 26 Oct 2023 • Gene-Ping Yang, Hao Tang

We study two key properties that enable matching, namely, whether cluster centroids of self-supervised representations reduce the variability of phone instances and respect the relationship among phones.

Self-Supervised Learning
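
The clustering property described in this entry can be probed with a simple analysis. Below is a minimal sketch, not the paper's code: it assumes frame-level self-supervised features with frame-aligned phone labels and uses k-means centroids purely for illustration.

```python
# Hypothetical probe: do k-means centroids of self-supervised frame features
# reduce phone variability? `features` (frames x dim) and integer `phone_labels`
# (frames,) are assumed inputs, e.g. from a forced-aligned corpus.
import numpy as np
from sklearn.cluster import KMeans

def centroid_phone_probe(features, phone_labels, n_clusters=100, seed=0):
    kmeans = KMeans(n_clusters=n_clusters, random_state=seed, n_init=10)
    cluster_ids = kmeans.fit_predict(features)
    quantized = kmeans.cluster_centers_[cluster_ids]  # each frame -> its centroid

    # Within-phone variability before and after quantizing frames to centroids.
    phones = np.unique(phone_labels)
    raw_var = np.mean([features[phone_labels == p].var(axis=0).sum() for p in phones])
    quant_var = np.mean([quantized[phone_labels == p].var(axis=0).sum() for p in phones])

    # Cluster purity: fraction of frames in each cluster sharing the majority phone.
    purities = []
    for c in range(n_clusters):
        mask = cluster_ids == c
        if mask.any():
            purities.append(np.bincount(phone_labels[mask]).max() / mask.sum())
    return raw_var, quant_var, float(np.mean(purities))
```

A drop from raw_var to quant_var together with high purity would be consistent with centroids that both reduce phone variability and respect phone identity.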

On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation

no code implementations • 6 Jul 2023 • Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu

Our approach used a teacher-student framework to transfer knowledge from a larger, more complex model to a smaller, light-weight model using dual-view cross-correlation distillation and the teacher's codebook as learning objectives.

Keyword Spotting Knowledge Distillation +1
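
The paper's "dual-view cross-correlation distillation" is not spelled out in this snippet; the sketch below is only a generic cross-correlation (Barlow Twins-style) distillation loss between teacher and student embeddings, with the tensor shapes and off-diagonal weight chosen as assumptions rather than taken from the paper.

```python
# Assumed sketch of a cross-correlation style distillation loss between teacher
# and student embeddings; not the paper's exact objective.
import torch

def cross_correlation_distill_loss(student_emb, teacher_emb, off_diag_weight=5e-3):
    # student_emb, teacher_emb: (batch, dim); the student is assumed to project
    # into the teacher's embedding dimension.
    s = (student_emb - student_emb.mean(0)) / (student_emb.std(0) + 1e-6)
    t = (teacher_emb - teacher_emb.mean(0)) / (teacher_emb.std(0) + 1e-6)

    c = s.T @ t / s.shape[0]                        # (dim, dim) cross-correlation matrix
    diag = torch.diagonal(c)
    on_diag = ((diag - 1.0) ** 2).sum()             # align matching dimensions
    off_diag = (c ** 2).sum() - (diag ** 2).sum()   # decorrelate the rest
    return on_diag + off_diag_weight * off_diag
```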

Supervised Attention in Sequence-to-Sequence Models for Speech Recognition

no code implementations • 25 Apr 2022 • Gene-Ping Yang, Hao Tang

The attention mechanism in sequence-to-sequence models is designed to model the alignment between acoustic features and output tokens in speech recognition.

Speech Recognition
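
One common way to supervise attention, sketched below, is to penalize the divergence between the decoder's attention distribution and a reference alignment (e.g. from forced alignment). This is a hypothetical illustration under those assumptions, not necessarily the paper's exact formulation.

```python
# Hypothetical supervised-attention auxiliary loss: KL divergence between a
# reference alignment distribution and the model's attention weights.
import torch

def supervised_attention_loss(attn_weights, ref_alignment, eps=1e-8):
    # attn_weights:  (batch, num_tokens, num_frames), each row sums to 1
    # ref_alignment: (batch, num_tokens, num_frames), reference distribution
    kl = ref_alignment * (torch.log(ref_alignment + eps) - torch.log(attn_weights + eps))
    return kl.sum(dim=-1).mean()   # sum over frames, average over tokens and batch
```

Such a term would typically be added to the usual sequence-level training objective with a small weight.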

Stabilizing Label Assignment for Speech Separation by Self-supervised Pre-training

1 code implementation • 29 Oct 2020 • Sung-Feng Huang, Shun-Po Chuang, Da-Rong Liu, Yi-Chen Chen, Gene-Ping Yang, Hung-Yi Lee

Speech separation is now well developed thanks to the very successful permutation invariant training (PIT) approach, although the frequent label-assignment switching that occurs during PIT training remains a problem when better convergence speed and higher achievable performance are desired.

Ranked #6 on Speech Separation on Libri2Mix (using extra training data)

Speaker Separation Speech Enhancement +1
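
For context on the label-assignment switching mentioned in the abstract, a minimal utterance-level PIT loss is sketched below; the MSE criterion and tensor shapes are assumptions, not the paper's exact setup.

```python
# Minimal utterance-level permutation invariant training (PIT) loss sketch:
# evaluate the loss under every source permutation and keep the best one.
from itertools import permutations
import torch

def pit_mse_loss(estimates, targets):
    # estimates, targets: (batch, num_sources, time)
    num_sources = estimates.shape[1]
    per_perm = []
    for perm in permutations(range(num_sources)):
        permuted = estimates[:, list(perm), :]                 # reorder estimated sources
        per_perm.append(((permuted - targets) ** 2).mean(dim=(1, 2)))
    per_perm = torch.stack(per_perm, dim=1)                    # (batch, num_permutations)
    min_loss, best_perm = per_perm.min(dim=1)                  # best assignment per utterance
    return min_loss.mean(), best_perm
```

The returned best_perm can change from step to step for the same utterance during training, which is exactly the label-assignment switching that the self-supervised pre-training in this paper aims to stabilize.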

Interrupted and cascaded permutation invariant training for speech separation

1 code implementation • 28 Oct 2019 • Gene-Ping Yang, Szu-Lin Wu, Yao-Wen Mao, Hung-Yi Lee, Lin-shan Lee

Permutation Invariant Training (PIT) has long been a stepping-stone method for training speech separation models to handle the label ambiguity problem.

Speech Separation

Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering

1 code implementation • 16 Apr 2019 • Gene-Ping Yang, Chao-I Tuan, Hung-Yi Lee, Lin-shan Lee

Substantial effort has been reported on approaches over the spectrogram, which is well known as the standard time-and-frequency cross-domain representation for speech signals.

Clustering Speech Separation
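
For reference, the spectrogram mentioned in this entry is the magnitude of a short-time Fourier transform; a minimal sketch using torchaudio follows, with the window parameters chosen only for illustration.

```python
# Minimal magnitude-spectrogram sketch; n_fft and hop_length are illustrative.
import torchaudio

def magnitude_spectrogram(waveform, n_fft=512, hop_length=128):
    # waveform: (channels, time) float tensor
    transform = torchaudio.transforms.Spectrogram(
        n_fft=n_fft, hop_length=hop_length, power=1.0  # power=1.0 -> magnitude
    )
    return transform(waveform)  # (channels, n_fft // 2 + 1, num_frames)
```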
