Search Results for author: Alessandro Ragano

Found 8 papers, 0 papers with code

Dialogue Understandability: Why are we streaming movies with subtitles?

no code implementations • 22 Mar 2024 • Helard Becerra, Alessandro Ragano, Diptasree Debnath, Asad Ullah, Crisron Rudolf Lucas, Martin Walsh, Andrew Hines

Watching movies and TV shows with subtitles enabled is not simply down to audibility or speech intelligibility.

Reduce, Reuse, Recycle: Is Perturbed Data better than Other Language augmentation for Low Resource Self-Supervised Speech Models

no code implementations • 22 Sep 2023 • Asad Ullah, Alessandro Ragano, Andrew Hines

Our findings suggest that for resource constrained languages, in-domain synthetic augmentation can outperform knowledge transfer from accented or other language speech.

Representation Learning, Transfer Learning

Learning Music Representations with wav2vec 2.0

no code implementations • 27 Oct 2022 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines

In addition, the results are superior to the pre-trained model on speech embeddings, demonstrating that wav2vec 2.0 pre-trained on music data can be a promising music representation model.

Music Classification

A Comparison of Deep Learning MOS Predictors for Speech Synthesis Quality

no code implementations • 5 Apr 2022 • Alessandro Ragano, Emmanouil Benetos, Michael Chinen, Helard B. Martinez, Chandan K. A. Reddy, Jan Skoglund, Andrew Hines

In this paper, we evaluate several MOS predictors based on wav2vec 2.0 and the NISQA speech quality prediction model to explore the role of the training data, the influence of the system type, and the role of cross-domain features in SSL models.

Benchmarking, Self-Supervised Learning +1

Exploring the influence of fine-tuning data on wav2vec 2.0 model for blind speech quality prediction

no code implementations • 5 Apr 2022 • Helard Becerra, Alessandro Ragano, Andrew Hines

Further research is needed to evaluate other wav2vec 2.0 models pre-trained with multi-lingual datasets and to develop prediction models that are more resilient to language diversity.

More for Less: Non-Intrusive Speech Quality Assessment with Limited Annotations

no code implementations • 19 Aug 2021 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines

This paper indicates that multi-task learning combined with feature representations from unlabelled data is a promising approach to deal with the lack of large MOS annotated datasets.

Clustering, Deep Clustering +1

Audio Impairment Recognition Using a Correlation-Based Feature Representation

no code implementations • 22 Mar 2020 • Alessandro Ragano, Emmanouil Benetos, Andrew Hines

Audio impairment recognition is based on finding noise in audio files and categorising the impairment type.
