no code implementations • 14 Sep 2023 • Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang
We show that the USM-SCD model can achieve more than 75% average speaker change detection F1 score across a test set that consists of data from 96 languages.
1 code implementation • 10 Mar 2022 • Jason Pelecanos, Quan Wang, Yiling Huang, Ignacio Lopez Moreno
This paper presents a novel study of parameter-free attentive scoring for speaker verification.
1 code implementation • 24 Feb 2022 • Quan Wang, Yang Yu, Jason Pelecanos, Yiling Huang, Ignacio Lopez Moreno
In this paper, we introduce a novel language identification system based on conformer layers.
1 code implementation • 5 Apr 2021 • Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno
In this work, we propose scoring these representations in a way that can capture uncertainty, enroll/test asymmetry, and additional non-linear information.
1 code implementation • 5 Apr 2021 • Roza Chojnacka, Jason Pelecanos, Quan Wang, Ignacio Lopez Moreno
To the best of our knowledge, this is the first study of speaker verification systems at the scale of 46 languages.
no code implementations • 24 Nov 2020 • Yiling Huang, Yutian Chen, Jason Pelecanos, Quan Wang
In recent years, Text-To-Speech (TTS) has been used as a data augmentation technique for speech recognition to help complement inadequacies in the training data.
2 code implementations • 9 Sep 2020 • Quan Wang, Ignacio Lopez Moreno, Mert Saglam, Kevin Wilson, Alan Chiao, Renjie Liu, Yanzhang He, Wei Li, Jason Pelecanos, Marily Nika, Alexander Gruenstein
We introduce VoiceFilter-Lite, a single-channel source separation model that runs on the device to preserve only the speech signals from a target user, as part of a streaming speech recognition system.
no code implementations • 5 May 2016 • Seyed Omid Sadjadi, Jason Pelecanos, Sriram Ganapathy
We present the recent advances along with an error analysis of the IBM speaker recognition system for conversational speech.
Automatic Speech Recognition (ASR)