no code implementations • 22 Mar 2024 • Helard Becerra, Alessandro Ragano, Diptasree Debnath, Asad Ullah, Crisron Rudolf Lucas, Martin Walsh, Andrew Hines
Watching movies and TV shows with subtitles enabled is not simply down to audibility or speech intelligibility.
no code implementations • 22 Sep 2023 • Asad Ullah, Alessandro Ragano, Andrew Hines
Our findings suggest that for resource-constrained languages, in-domain synthetic augmentation can outperform knowledge transfer from accented or other-language speech.
no code implementations • 27 Oct 2022 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines
In addition, the results are superior to the pre-trained model on speech embeddings, demonstrating that wav2vec 2.0 pre-trained on music data can be a promising music representation model.
no code implementations • 14 Sep 2022 • Michael Chinen, Jan Skoglund, Chandan K A Reddy, Alessandro Ragano, Andrew Hines
Non-reference speech quality models are important for a growing number of applications.
no code implementations • 5 Apr 2022 • Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines
In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influence of the system type, and the role of cross-domain features in SSL models.
no code implementations • 5 Apr 2022 • Helard Becerra, Alessandro Ragano, Andrew Hines
Further research is needed to evaluate other wav2vec 2.0 models pre-trained with multi-lingual datasets and to develop prediction models that are more resilient to language diversity.
no code implementations • 19 Aug 2021 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines
This paper indicates that multi-task learning combined with feature representations from unlabelled data is a promising approach to deal with the lack of large MOS-annotated datasets.
no code implementations • 22 Mar 2020 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines
Audio impairment recognition is based on finding noise in audio files and categorising the impairment type.