no code implementations • 14 Sep 2022 • Michael Chinen, Jan Skoglund, Chandan K. A. Reddy, Alessandro Ragano, Andrew Hines
Non-reference speech quality models are important for a growing number of applications.
no code implementations • 5 Jul 2022 • Ali Siahkoohi, Michael Chinen, Tom Denton, W. Bastiaan Kleijn, Jan Skoglund
Our numerical experiments show that supplementing the convolutional encoder of a neural speech codec with Transformer speech embeddings yields a speech codec with a bitrate of $600\,\mathrm{bps}$ that outperforms the original neural speech codec in synthesized speech quality when trained at the same bitrate.
no code implementations • 5 Apr 2022 • Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines
In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influence of the system type, and the role of cross-domain features in SSL models.
5 code implementations • 7 Jul 2021 • Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi
We present SoundStream, a novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs.
1 code implementation • 23 Feb 2021 • Tom Denton, Alejandro Luebs, Felicia S. C. Lim, Andrew Storus, Hengchin Yeh, W. Bastiaan Kleijn, Jan Skoglund
Recent advances in neural-network-based generative modeling of speech have shown great potential for speech coding.
2 code implementations • 20 Feb 2021 • Wissam A. Jassim, Jan Skoglund, Michael Chinen, Andrew Hines
Good speech quality has been achieved using waveform matching and parametric reconstruction coders.
1 code implementation • 18 Feb 2021 • W. Bastiaan Kleijn, Andrew Storus, Michael Chinen, Tom Denton, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Hengchin Yeh
We introduce predictive-variance regularization to reduce the sensitivity to outliers, resulting in a significant increase in performance.
2 code implementations • 28 Mar 2019 • Jean-Marc Valin, Jan Skoglund
We demonstrate that LPCNet operating at 1.6 kb/s achieves significantly higher quality than MELP and that uncompressed LPCNet can exceed the quality of a waveform codec operating at low bitrate.
2 code implementations • 28 Oct 2018 • Jean-Marc Valin, Jan Skoglund
We demonstrate that LPCNet can achieve significantly higher quality than WaveRNN for the same network size and that high quality LPCNet speech synthesis is achievable with a complexity under 3 GFLOPS.
1 code implementation • 1 Dec 2017 • W. Bastiaan Kleijn, Felicia S. C. Lim, Alejandro Luebs, Jan Skoglund, Florian Stimberg, Quan Wang, Thomas C. Walters
Traditional parametric coding of speech enables low bit rates but provides poor reconstruction quality because of the inadequacy of the model used.