no code implementations • 14 Sep 2022 • Michael Chinen, Jan Skoglund, Chandan K. A. Reddy, Alessandro Ragano, Andrew Hines
Non-reference speech quality models are important for a growing number of applications.
no code implementations • 5 Jul 2022 • Ali Siahkoohi, Michael Chinen, Tom Denton, W. Bastiaan Kleijn, Jan Skoglund
Our numerical experiments show that supplementing the convolutional encoder of a neural speech codec with Transformer speech embeddings yields a speech codec with a bitrate of $600\,\mathrm{bps}$ that outperforms the original neural speech codec in synthesized speech quality when trained at the same bitrate.
no code implementations • 5 Apr 2022 • Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines
In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influence of the system type, and the role of cross-domain features in SSL models.
2 code implementations • 20 Feb 2021 • Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines
Good speech quality has been achieved using waveform matching and parametric reconstruction coders.
1 code implementation • 18 Feb 2021 • W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh
We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance.
no code implementations • 20 Nov 2018 • Scott Wisdom, John R. Hershey, Kevin Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous
Furthermore, the only previous approaches that apply mixture consistency use real-valued masks; mixture consistency has been ignored for complex-valued masks.
Sound • Audio and Speech Processing