no code implementations • 2 Oct 2023 • Samuele Cornell, Jee-weon Jung, Shinji Watanabe, Stefano Squartini
This paper presents a novel framework for joint speaker diarization (SD) and automatic speech recognition (ASR), named SLIDAR (sliding-window diarization-augmented recognition).
no code implementations • 31 Jul 2023 • Valeria Bruschi, Michela Cantarini, Luca Serafini, Stefano Nobili, Stefania Cecchi, Stefano Squartini
Snoring is a common disorder that affects people's social and marital lives.
1 code implementation • 28 Jul 2023 • Carlo Aironi, Samuele Cornell, Luca Serafini, Stefano Squartini
Packet loss is a major cause of voice quality degradation in VoIP transmissions with serious impact on intelligibility and user experience.
no code implementations • 23 Jun 2023 • Samuele Cornell, Matthew Wiesner, Shinji Watanabe, Desh Raj, Xuankai Chang, Paola Garcia, Matthew Maciejewski, Yoshiki Masuyama, Zhong-Qiu Wang, Stefano Squartini, Sanjeev Khudanpur
The CHiME challenges have played a significant role in the development and evaluation of robust automatic speech recognition (ASR) systems.
no code implementations • 29 May 2023 • Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, Stefano Squartini
We found that, among all methods considered, EEND-vector clustering (EEND-VC) offers the best trade-off in terms of computing requirements and performance.
no code implementations • 21 Mar 2023 • Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini
Recent works show that speech separation guided diarization (SSGD) is an increasingly promising direction, mainly thanks to the recent progress in speech separation.
no code implementations • 31 May 2022 • Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini
Continuous speech separation (CSS) is a recently proposed framework that aims to separate each speaker from an input mixture signal in a streaming fashion.
1 code implementation • 5 Apr 2022 • Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini
In particular, we compare two low-latency speech separation models.
no code implementations • 8 Nov 2021 • Samuele Cornell, Manuel Pariente, François Grondin, Stefano Squartini
We perform a detailed analysis using the recent Clarity Challenge data and show that by using learnt filterbanks it is possible to surpass oracle-mask based beamforming for short windows.
1 code implementation • 5 Oct 2021 • Giovanni Pepe, Leonardo Gabrielli, Stefano Squartini, Carlo Tripodi, Nicolò Strozzi
This paper describes a novel Deep Learning method for the design of IIR parametric filters for automatic audio equalization.
no code implementations • 29th European Signal Processing Conference (EUSIPCO) 2021 • Carlo Aironi, Samuele Cornell, Emanuele Principi, Stefano Squartini
In recent years there has been a considerable rise in interest in Graph Representation and Learning techniques, especially in cases where the data intrinsically has a graph-like structure: social networks, molecular lattices, or semantic interactions, to name a few.
1 code implementation • 6 Apr 2021 • Samuele Cornell, Alessio Brutti, Marco Matassoni, Stefano Squartini
Fully exploiting ad-hoc microphone networks for distant speech recognition is still an open issue.
no code implementations • 12 Nov 2020 • Andrea Castellani, Sebastian Schmitt, Stefano Squartini
The approaches make use of a Digital Twin to generate a training dataset which simulates the normal operation of the machinery, along with a small set of labeled anomalous measurements from the real machinery.
no code implementations • 6 Nov 2019 • Md Sahidullah, Jose Patino, Samuele Cornell, Ruiqing Yin, Sunit Sivasankaran, Hervé Bredin, Pavel Korshunov, Alessio Brutti, Romain Serizel, Emmanuel Vincent, Nicholas Evans, Sébastien Marcel, Stefano Squartini, Claude Barras
This paper describes the speaker diarization systems developed for the Second DIHARD Speech Diarization Challenge (DIHARD II) by the Speed team.
1 code implementation • 23 Feb 2019 • Michele D'Incecco, Stefano Squartini, Mingjun Zhong
It is not clear whether the method can be generalised or transferred to different domains, e.g., when the test data are drawn from a different country than the training data.
1 code implementation • 15 Oct 2018 • Fabio Vesperini, Leonardo Gabrielli, Emanuele Principi, Stefano Squartini
Artificial sound event detection (SED) aims to mimic the human ability to perceive and understand what is happening in the surroundings.
no code implementations • 14 Sep 2018 • Leonardo Gabrielli, Stefano Tomassetti, Stefano Squartini, Carlo Zinato, Stefano Guaiana
In this work we refine previous results by introducing the former approach in a multi-stage algorithm that also adds heuristics and a stochastic optimization method operating on objective cost functions based on psychoacoustics.