Search Results for author: Daniel Michelsanti

Found 8 papers, 1 paper with code

Speech inpainting: Context-based speech synthesis guided by video

no code implementations · 1 Jun 2023 · Juan F. Montesinos, Daniel Michelsanti, Gloria Haro, Zheng-Hua Tan, Jesper Jensen

Audio and visual modalities are inherently connected in speech signals: lip movements and facial expressions are correlated with speech sounds.

Speech Recognition +1

Audio-Visual Speech Inpainting with Deep Learning

no code implementations · 9 Oct 2020 · Giovanni Morrone, Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen

In this paper, we present a deep-learning-based framework for audio-visual speech inpainting, i.e., the task of restoring the missing parts of an acoustic speech signal from reliable audio context and uncorrupted visual information.

Multi-Task Learning

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

1 code implementation · 21 Aug 2020 · Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target speech signals, respectively, from a mixture of sounds generated by several sources.

Speech Enhancement, Speech Separation

Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect

no code implementations · 29 May 2019 · Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Regarding speech intelligibility, we find a general tendency toward a benefit from training the systems with Lombard speech.

Speech Enhancement

Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems

no code implementations · 15 Nov 2018 · Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Humans tend to change their way of speaking when they are immersed in a noisy environment, a reflex known as the Lombard effect.

Speech Enhancement

On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement

no code implementations · 15 Nov 2018 · Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Audio-visual speech enhancement (AV-SE) is the task of improving speech quality and intelligibility in a noisy environment using audio and visual information from a talker.

Speech Enhancement

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification

no code implementations · 6 Sep 2017 · Daniel Michelsanti, Zheng-Hua Tan

Improving speech system performance in noisy environments remains a challenging task, and speech enhancement (SE) is one of the effective techniques to solve the problem.

Speaker Verification, Speech Enhancement
