no code implementations • 28 Aug 2024 • John Janiczek, Dading Chong, Dongyang Dai, Arlo Faria, Chao Wang, Tao Wang, Yuzong Liu
The discriminator is used in a training pipeline that improves both the acoustic and prosodic features of a TTS model.
no code implementations • 6 Jul 2023 • Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu
Our approach used a teacher-student framework to transfer knowledge from a larger, more complex model to a smaller, lightweight model using dual-view cross-correlation distillation and the teacher's codebook as learning objectives.
no code implementations • 21 Apr 2023 • Zuhaib Akhtar, Mohammad Omar Khursheed, Dongsu Du, Yuzong Liu
In this work, we present Slimmable Neural Networks applied to the problem of small-footprint keyword spotting.
no code implementations • 7 Mar 2023 • Chenyang Gao, Yue Gu, Francesco Caliva, Yuzong Liu
Self-supervised speech representation learning (S3RL) is revolutionizing the way we leverage the ever-growing availability of data.
no code implementations • 4 Mar 2023 • Sashank Macha, Om Oza, Alex Escott, Francesco Caliva, Robbie Armitano, Santosh Kumar Cheekatmalla, Sree Hari Krishnan Parthasarathi, Yuzong Liu
Furthermore, on an in-house KWS dataset, we show that our 8-bit FXP-QAT models have a 4-6% improvement in relative false discovery rate at fixed false reject rate compared to full precision FLP models.
no code implementations • 13 Jul 2022 • Lu Zeng, Sree Hari Krishnan Parthasarathi, Yuzong Liu, Alex Escott, Santosh Kumar Cheekatmalla, Nikko Strom, Shiv Vitaladevuni
We organize our results in two embedded chipset settings: a) with the commodity ARM NEON instruction set and 8-bit containers, we present accuracy, CPU, and memory results using sub-8-bit weights (4, 5, 8-bit) and 8-bit quantization of the rest of the network; b) with off-the-shelf neural network accelerators, for a range of weight bit widths (1 and 5-bit), we present accuracy results and project the reduction in memory utilization.
1 code implementation • 11 Dec 2020 • Shaoshi Ling, Yuzong Liu
In speech representation learning, a large amount of unlabeled data is used in a self-supervised manner to learn a feature representation.
no code implementations • 30 Nov 2020 • Siddharth Dalmia, Yuzong Liu, Srikanth Ronanki, Katrin Kirchhoff
We live in a world where 60% of the population can speak two or more languages fluently.
Automatic Speech Recognition (ASR)
no code implementations • 1 Jun 2020 • Chander Chandak, Zeynab Raeesy, Ariya Rastrow, Yuzong Liu, Xiangyang Huang, Siyu Wang, Dong Kwon Joo, Roland Maas
A common approach to solve multilingual speech recognition is to run multiple monolingual ASR systems in parallel and rely on a language identification (LID) component that detects the input language.
1 code implementation • 3 Dec 2019 • Shaoshi Ling, Yuzong Liu, Julian Salazar, Katrin Kirchhoff
We propose a novel approach to semi-supervised automatic speech recognition (ASR).
Automatic Speech Recognition (ASR)
1 code implementation • 30 Jun 2019 • Shaoshi Ling, Julian Salazar, Yuzong Liu, Katrin Kirchhoff
We introduce BERTphone, a Transformer encoder trained on large speech corpora that outputs phonetically-aware contextual representation vectors that can be used for both speaker and language recognition.
no code implementations • NAACL 2019 • Courtney Mansfield, Ming Sun, Yuzong Liu, Ankur Gandhe, Björn Hoffmeister
We find subword models with additional linguistic features yield the best performance (with a word error rate of 0.17%).
no code implementations • 6 Feb 2019 • Yiming Wang, Xing Fan, I-Fan Chen, Yuzong Liu, Tongfei Chen, Björn Hoffmeister
The anchored segment refers to the wake-up word part of an audio stream, which contains valuable speaker information that can be used to suppress interfering speech and background noise.
no code implementations • NeurIPS 2010 • Yuzong Liu, Mohit Sharma, Charles Gaona, Jonathan Breshears, Jarod Roland, Zachary Freudenburg, Eric Leuthardt, Kilian Q. Weinberger
For successful upper limb BCIs, it is important to decode finger movements from brain activity.