Search Results for author: Enrico Zovato

Found 5 papers, 1 paper with code

An Experimental Review of Speaker Diarization methods with application to Two-Speaker Conversational Telephone Speech recordings

no code implementations • 29 May 2023 • Luca Serafini, Samuele Cornell, Giovanni Morrone, Enrico Zovato, Alessio Brutti, Stefano Squartini

We found that, among all methods considered, EEND-vector clustering (EEND-VC) offers the best trade-off in terms of computing requirements and performance.

Clustering, speaker-diarization

End-to-End Integration of Speech Separation and Voice Activity Detection for Low-Latency Diarization of Telephone Conversations

no code implementations • 21 Mar 2023 • Giovanni Morrone, Samuele Cornell, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

Finally, we also show that the separated signals can be readily used also for automatic speech recognition, reaching performance close to using oracle sources in some configurations.

Action Detection, Activity Detection

Conversational Speech Separation: an Evaluation Study for Streaming Applications

no code implementations • 31 May 2022 • Giovanni Morrone, Samuele Cornell, Enrico Zovato, Alessio Brutti, Stefano Squartini

Continuous speech separation (CSS) is a recently proposed framework which aims at separating each speaker from an input mixture signal in a streaming fashion.

Speech Separation

Low-Latency Speech Separation Guided Diarization for Telephone Conversations

1 code implementation • 5 Apr 2022 • Giovanni Morrone, Samuele Cornell, Desh Raj, Luca Serafini, Enrico Zovato, Alessio Brutti, Stefano Squartini

In particular, we compare two low-latency speech separation models.

Action Detection, Activity Detection

Voice Cloning: a Multi-Speaker Text-to-Speech Synthesis Approach based on Transfer Learning

no code implementations • 10 Feb 2021 • Giuseppe Ruggiero, Enrico Zovato, Luigi di Caro, Vincent Pollet

This is the main reason why the TTS models are usually single speaker.

Speech Synthesis, Text-To-Speech Synthesis
