Search Results for author: Daniel Michelsanti

Found 8 papers, 1 paper with code

Speech inpainting: Context-based speech synthesis guided by video

no code implementations • 1 Jun 2023 • Juan F. Montesinos, Daniel Michelsanti, Gloria Haro, Zheng-Hua Tan, Jesper Jensen

Audio and visual modalities are inherently connected in speech signals: lip movements and facial expressions are correlated with speech sounds.

Speech Recognition

Audio-Visual Speech Inpainting with Deep Learning

no code implementations • 9 Oct 2020 • Giovanni Morrone, Daniel Michelsanti, Zheng-Hua Tan, Jesper Jensen

In this paper, we present a deep-learning-based framework for audio-visual speech inpainting, i.e., the task of restoring the missing parts of an acoustic speech signal from reliable audio context and uncorrupted visual information.

Multi-Task Learning

An Overview of Deep-Learning-Based Audio-Visual Speech Enhancement and Separation

1 code implementation • 21 Aug 2020 • Daniel Michelsanti, Zheng-Hua Tan, Shi-Xiong Zhang, Yong Xu, Meng Yu, Dong Yu, Jesper Jensen

Speech enhancement and speech separation are two related tasks, whose purpose is to extract either one or more target speech signals, respectively, from a mixture of sounds generated by several sources.

Speech Enhancement, Speech Separation

Vocoder-Based Speech Synthesis from Silent Videos

no code implementations • 6 Apr 2020 • Daniel Michelsanti, Olga Slizovskaia, Gloria Haro, Emilia Gómez, Zheng-Hua Tan, Jesper Jensen

Both acoustic and visual information influence human perception of speech.

Multi-Task Learning, Speech Synthesis

Deep-Learning-Based Audio-Visual Speech Enhancement in Presence of Lombard Effect

no code implementations • 29 May 2019 • Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Regarding speech intelligibility, we find a general tendency towards a benefit from training the systems with Lombard speech.

Speech Enhancement

Effects of Lombard Reflex on the Performance of Deep-Learning-Based Audio-Visual Speech Enhancement Systems

no code implementations • 15 Nov 2018 • Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Humans tend to change their way of speaking when they are immersed in a noisy environment, a reflex known as the Lombard effect.

Speech Enhancement

On Training Targets and Objective Functions for Deep-Learning-Based Audio-Visual Speech Enhancement

no code implementations • 15 Nov 2018 • Daniel Michelsanti, Zheng-Hua Tan, Sigurdur Sigurdsson, Jesper Jensen

Audio-visual speech enhancement (AV-SE) is the task of improving speech quality and intelligibility in a noisy environment using audio and visual information from a talker.

Speech Enhancement

Conditional Generative Adversarial Networks for Speech Enhancement and Noise-Robust Speaker Verification

no code implementations • 6 Sep 2017 • Daniel Michelsanti, Zheng-Hua Tan

Improving speech system performance in noisy environments remains a challenging task, and speech enhancement (SE) is one of the effective techniques to solve the problem.

Speaker Verification, Speech Enhancement
