1 code implementation • 23 Sep 2024 • Yuanchao Li, Azalea Gui, Dimitra Emmanouilidou, Hannes Gamper
In this work, we conduct a study on Music Emotion Recognition (MER) and Emotional Music Generation (EMG), employing diverse audio encoders alongside the Fréchet Audio Distance (FAD), a reference-free evaluation metric.
1 code implementation • 20 Jul 2024 • Sebastian Braun, Hannes Gamper
We propose a novel training scheme using self-label correction and data augmentation methods designed to deal with noisy labels and improve real-world accuracy on a polyphonic audio content detection task.
1 code implementation • 29 May 2024 • Eloi Moliner, Sebastian Braun, Hannes Gamper
This paper investigates the potential of Gaussian Flow Bridges, an emerging approach in generative modeling, for audio domain transfer.
1 code implementation • 1 Feb 2024 • Soham Deshmukh, Dareen Alharthi, Benjamin Elizalde, Hannes Gamper, Mahmoud Al Ismail, Rita Singh, Bhiksha Raj, Huaming Wang
Here, we exploit this capability and introduce PAM, a no-reference metric for assessing audio quality for different audio processing tasks.
no code implementations • 8 Dec 2023 • Ruihan Yang, Hannes Gamper, Sebastian Braun
We introduce a multi-modal diffusion model tailored for the bi-directional conditional generation of video and audio.
3 code implementations • 2 Nov 2023 • Azalea Gui, Hannes Gamper, Sebastian Braun, Dimitra Emmanouilidou
The growing popularity of generative music models underlines the need for perceptually relevant, objective music quality metrics.
1 code implementation • 22 Sep 2023 • Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Evgenii Indenbom, Nicolae-Catalin Ristea, Jegor Gužvin, Hannes Gamper, Sebastian Braun, Robert Aichner
This is the fourth AEC challenge, enhanced by adding a second track for personalized acoustic echo cancellation, reducing the algorithmic + buffering latency to 20 ms, and including a full-band version of AECMOS.
no code implementations • 4 Dec 2022 • Haleh Akrami, Hannes Gamper
The mean opinion score (MOS) is standardized for the perceptual evaluation of speech quality and is obtained by asking listeners to rate the quality of a speech sample.
1 code implementation • 27 Feb 2022 • Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner
We open-source datasets and test sets for researchers to train their deep noise suppression models, as well as a subjective evaluation framework based on ITU-T P.835 to rate and rank-order the challenge entries.
1 code implementation • 27 Feb 2022 • Ross Cutler, Ando Saabas, Tanel Parnamaa, Marju Purin, Hannes Gamper, Sebastian Braun, Karsten Sørensen, Robert Aichner
This is the third AEC challenge and it is enhanced by including mobile scenarios, adding speech recognition rate in the challenge goal metrics, and making the default sample rate 48 kHz.
no code implementations • 23 Nov 2021 • Sebastian Braun, Hannes Gamper
Deep learning-based speech enhancement has progressed rapidly in improving quality, while models are becoming more compact and usable for real-time on-the-edge inference.
no code implementations • 22 Jan 2021 • Sebastian Braun, Hannes Gamper, Chandan K. A. Reddy, Ivan Tashev
We show that the achievable speech quality is a function of network complexity, and identify which models offer better tradeoffs.
2 code implementations • 6 Jan 2021 • Chandan K A Reddy, Harishchandra Dubey, Kazuhito Koishida, Arun Nair, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan
In this version of the challenge organized at INTERSPEECH 2021, we are expanding both our training and test datasets to accommodate full-band scenarios.
1 code implementation • 10 Sep 2020 • Kusha Sridhar, Ross Cutler, Ando Saabas, Tanel Parnamaa, Hannes Gamper, Sebastian Braun, Robert Aichner, Sriram Srinivasan
In this challenge, we open-source two large datasets to train AEC models under both single-talk and double-talk scenarios.
1 code implementation • 30 Oct 2019 • Ziqi Fan, Vibhav Vineet, Hannes Gamper, Nikunj Raghuvanshi
Diffracted scattering and occlusion are important acoustic effects in interactive auralization and noise control applications, typically requiring expensive numerical simulation.