Search Results for author: Michael Chinen

Found 6 papers, 2 papers with code

Ultra-Low-Bitrate Speech Coding with Pretrained Transformers

no code implementations • 5 Jul 2022 • Ali Siahkoohi, Michael Chinen, Tom Denton, W. Bastiaan Kleijn, Jan Skoglund

Our numerical experiments show that supplementing the convolutional encoder of a neural speech codec with Transformer speech embeddings yields a speech codec with a bitrate of $600\,\mathrm{bps}$ that outperforms the original neural speech codec in synthesized speech quality when trained at the same bitrate.

Inductive Bias

A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality

no code implementations • 5 Apr 2022 • Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines

In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influence of the system type, and the role of cross-domain features in SSL models.

Benchmarking, Self-Supervised Learning

WARP-Q: Quality Prediction For Generative Neural Speech Codecs

2 code implementations • 20 Feb 2021 • Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines

Good speech quality has been achieved using waveform matching and parametric reconstruction coders.

Dynamic Time Warping

Generative Speech Coding with Predictive Variance Regularization

1 code implementation • 18 Feb 2021 • W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh

We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance.

Differentiable Consistency Constraints for Improved Deep Speech Enhancement

no code implementations • 20 Nov 2018 • Scott Wisdom, John R. Hershey, Kevin Wilson, Jeremy Thorpe, Michael Chinen, Brian Patton, Rif A. Saurous

Furthermore, the only previous approaches that apply mixture consistency use real-valued masks; mixture consistency has been ignored for complex-valued masks.

Sound, Audio and Speech Processing
