Search Results for author: Qiongqiong Wang

Found 15 papers, 3 papers with code

Contextual Paralinguistic Data Creation for Multi-Modal Speech-LLM: Data Condensation and Spoken QA Generation

no code implementations · 19 May 2025 · Qiongqiong Wang, Hardik B. Sailor, Tianchi Liu, Ai Ti Aw

Current speech-LLMs exhibit limited capability in contextual reasoning alongside paralinguistic understanding, primarily due to the lack of Question-Answer (QA) datasets that cover both aspects.

Dataset Generation

Towards Quantifying and Reducing Language Mismatch Effects in Cross-Lingual Speech Anti-Spoofing

no code implementations · 12 Sep 2024 · Tianchi Liu, Ivan Kukanov, Zihan Pan, Qiongqiong Wang, Hardik B. Sailor, Kong Aik Lee

The effects of language mismatch impact speech anti-spoofing systems, while investigations and quantification of these effects remain limited.

Speech Foundation Model Ensembles for the Controlled Singing Voice Deepfake Detection (CtrSVDD) Challenge 2024

2 code implementations · 3 Sep 2024 · Anmol Guragain, Tianchi Liu, Zihan Pan, Hardik B. Sailor, Qiongqiong Wang

This work details our approach to achieving a leading system with a 1.79% pooled equal error rate (EER) on the evaluation set of the Controlled Singing Voice Deepfake Detection (CtrSVDD).

DeepFake Detection, Face Swapping

Attentive Merging of Hidden Embeddings from Pre-trained Speech Model for Anti-spoofing Detection

1 code implementation · 12 Jun 2024 · Zihan Pan, Tianchi Liu, Hardik B. Sailor, Qiongqiong Wang

Self-supervised learning (SSL) speech representation models, trained on large speech corpora, have demonstrated effectiveness in extracting hierarchical speech embeddings through multiple transformer layers.

Computational Efficiency, Self-Supervised Learning

Cosine Scoring with Uncertainty for Neural Speaker Embedding

no code implementations · 11 Mar 2024 · Qiongqiong Wang, Kong Aik Lee

Uncertainty modeling in speaker representation aims to learn the variability present in speech utterances.

Speaker Recognition

Golden Gemini is All You Need: Finding the Sweet Spots for Speaker Verification

1 code implementation · 6 Dec 2023 · Tianchi Liu, Kong Aik Lee, Qiongqiong Wang, Haizhou Li

We represent the stride space on a trellis diagram, and conduct a systematic study on the impact of temporal and frequency resolutions on the performance and further identify two optimal points, namely Golden Gemini, which serves as a guiding principle for designing 2D ResNet-based speaker verification models.

All, Speaker Verification

Generalized domain adaptation framework for parametric back-end in speaker recognition

no code implementations · 24 May 2023 · Qiongqiong Wang, Koji Okabe, Kong Aik Lee, Takafumi Koshinaka

The efficacy of the proposed techniques has been experimentally validated on NIST 2016, 2018, and 2019 Speaker Recognition Evaluation (SRE'16, SRE'18, and SRE'19) datasets.

Speaker Recognition, Unsupervised Domain Adaptation

Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification

no code implementations · 23 Feb 2023 · Qiongqiong Wang, Kong Aik Lee, Tianchi Liu

We propose a log-likelihood ratio function for the PLDA scoring with the uncertainty propagation.

Speaker Verification

Scoring of Large-Margin Embeddings for Speaker Verification: Cosine or PLDA?

no code implementations · 8 Apr 2022 · Qiongqiong Wang, Kong Aik Lee, Tianchi Liu

The emergence of large-margin softmax cross-entropy losses in training deep speaker embedding neural networks has triggered a gradual shift from parametric back-ends to a simpler cosine similarity measure for speaker verification.

Speaker Verification

Task-aware Warping Factors in Mask-based Speech Enhancement

no code implementations · 27 Aug 2021 · Qiongqiong Wang, Kong Aik Lee, Takafumi Koshinaka, Koji Okabe, Hitoshi Yamamoto

It is easy to apply the proposed dual-warping factors approach to any mask-based SE method, and it allows a single SE system to handle multiple tasks without task-dependent training.

Automatic Speech Recognition (ASR)

Xi-Vector Embedding for Speaker Recognition

no code implementations · 12 Aug 2021 · Kong Aik Lee, Qiongqiong Wang, Takafumi Koshinaka

We present a Bayesian formulation for deep speaker embedding, wherein the xi-vector is the Bayesian counterpart of the x-vector, taking into account the uncertainty estimate.

Speaker Recognition
