Search Results for author: Sharon Gannot

Found 17 papers, 3 papers with code

dEchorate: a Calibrated Room Impulse Response Database for Echo-aware Signal Processing

2 code implementations · 27 Apr 2021 · Diego Di Carlo, Pinchas Tandeitnik, Cédric Foy, Antoine Deleforge, Nancy Bertin, Sharon Gannot

This paper presents dEchorate: a new database of measured multichannel Room Impulse Responses (RIRs) including annotations of early echo timings and 3D positions of microphones, real sources and image sources under different wall configurations in a cuboid room.

Benchmarking · Retrieval
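
For context, a measured RIR from such a database is typically used by convolving it with a dry (anechoic) signal to synthesise an echoic recording. The sketch below shows only that generic step, not the dEchorate toolkit; the file names are hypothetical placeholders.

```python
# Generic use of a measured RIR (not the dEchorate API): convolve a dry
# signal with the RIR to obtain a reverberant recording; keeping only the
# first ~50 ms of the RIR retains the direct path and early echoes.
import numpy as np
from scipy.signal import fftconvolve

fs = 16000
rir = np.load("rir_mic0_src0.npy")   # hypothetical file: one measured RIR
dry = np.load("dry_speech.npy")      # hypothetical file: anechoic speech

reverberant = fftconvolve(dry, rir)[: len(dry)]
early_only = fftconvolve(dry, rir[: int(0.05 * fs)])[: len(dry)]
```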

LipVoicer: Generating Speech from Silent Videos Guided by Lip Reading

1 code implementation · 5 Jun 2023 · Yochai Yemini, Aviv Shamsian, Lior Bracha, Sharon Gannot, Ethan Fetaya

We then condition a diffusion model on the video and use the extracted text through a classifier-guidance mechanism where a pre-trained ASR serves as the classifier.

Lip Reading
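
As a rough illustration of that classifier-guidance mechanism (not the LipVoicer code; `denoiser` and `asr_log_prob` are assumed components), the ASR's gradient with respect to the noisy audio nudges each reverse-diffusion step toward the lip-read transcript:

```python
# Illustrative classifier-guidance step, not the LipVoicer implementation.
import torch

def guided_step(x_t, t, video_emb, text_tokens, denoiser, asr_log_prob, scale=2.0):
    """One reverse-diffusion update where a pre-trained ASR acts as the classifier."""
    x_t = x_t.detach().requires_grad_(True)
    eps = denoiser(x_t, t, video_emb)                # video-conditioned noise estimate
    log_p = asr_log_prob(x_t, text_tokens)           # ASR likelihood of the lip-read text
    grad = torch.autograd.grad(log_p.sum(), x_t)[0]  # gradient pushes the audio toward the text
    return eps - scale * grad                        # guided noise estimate (schedule terms omitted)
```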

FCN Approach for Dynamically Locating Multiple Speakers

1 code implementation · 26 Aug 2020 · Hodaya Hammer, Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

In this paper, we present a deep neural network-based online multi-speaker localisation algorithm.
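
One common way to realise such an online localiser is a fully convolutional network that maps multichannel spectrogram features to per-frame activity over a grid of candidate DOAs. The sketch below assumes the grid resolution and layer sizes; it is not the architecture from the paper.

```python
# Sketch only: online multi-speaker localisation as per-frame, multi-label
# classification over a grid of candidate DOAs.
import torch
import torch.nn as nn

N_DOA = 37  # assumed grid: 0..180 degrees in 5-degree steps

class DOAClassifierFCN(nn.Module):
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 32, 3, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d((1, None)),   # pool over frequency, keep time
            nn.Conv2d(64, N_DOA, 1),           # per-frame DOA logits
        )

    def forward(self, spec):                   # spec: (batch, channels, freq, time)
        logits = self.net(spec).squeeze(2)     # (batch, N_DOA, time)
        return torch.sigmoid(logits)           # multi-label: several speakers may be active per frame
```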

Deep Clustering Based on a Mixture of Autoencoders

no code implementations · 16 Dec 2018 · Shlomo E. Chazan, Sharon Gannot, Jacob Goldberger

The optimal clustering is found by minimizing the reconstruction loss of the mixture-of-autoencoders network.

Clustering · Deep Clustering
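
The resulting cluster-assignment rule can be illustrated as follows (a sketch, not the authors' implementation): each sample is assigned to the autoencoder that reconstructs it with the smallest error.

```python
# Illustrative assignment rule for a mixture of autoencoders.
import torch

def cluster_assignments(x, autoencoders):
    """x: (batch, dim); autoencoders: list of trained modules reconstructing x."""
    errors = torch.stack(
        [((ae(x) - x) ** 2).mean(dim=1) for ae in autoencoders], dim=1
    )                             # (batch, n_clusters) reconstruction losses
    return errors.argmin(dim=1)   # index of the best-reconstructing autoencoder
```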

Machine learning in acoustics: theory and applications

no code implementations · 11 May 2019 · Michael J. Bianco, Peter Gerstoft, James Traer, Emma Ozanich, Marie A. Roch, Sharon Gannot, Charles-Alban Deledalle

Acoustic data provide scientific and engineering insights in fields ranging from biology and communications to ocean and Earth science.

BIG-bench Machine Learning

Semi-supervised source localization with deep generative modeling

no code implementations · 27 May 2020 · Michael J. Bianco, Sharon Gannot, Peter Gerstoft

We propose a semi-supervised localization approach based on deep generative modeling with variational autoencoders (VAEs).
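
A loose sketch of such a semi-supervised objective is given below; the encoder, decoder, classifier and loss weighting are assumptions, and details such as conditioning the decoder on the label are omitted. It is not the released code.

```python
# Loose sketch of a semi-supervised VAE objective: supervised cross-entropy
# on labeled data plus a VAE evidence lower bound on unlabeled data.
import torch
import torch.nn.functional as F

def semi_supervised_loss(x_lab, y_lab, x_unlab, encoder, decoder, classifier, alpha=1.0):
    # Labeled branch: supervised loss on the DOA classifier (hypothetical module).
    sup = F.cross_entropy(classifier(x_lab), y_lab)

    # Unlabeled branch: reconstruction + KL terms of the VAE.
    mu, logvar = encoder(x_unlab)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterisation
    recon = F.mse_loss(decoder(z), x_unlab)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl + alpha * sup
```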

Dynamically locating multiple speakers based on the time-frequency domain

no code implementations · 1 Jan 2021 · Hodaya Hammer, Shlomo Chazan, Jacob Goldberger, Sharon Gannot

In this study, we present a deep neural network-based online multi-speaker localisation algorithm that operates on a multi-microphone array.

Scene-Agnostic Multi-Microphone Speech Dereverberation

no code implementations · 22 Oct 2020 · Yochai Yemini, Ethan Fetaya, Haggai Maron, Sharon Gannot

We use noisy and noiseless versions of a simulated reverberant dataset to test the proposed architecture.

Position · Speech Dereverberation

Binaural LCMV Beamforming with Partial Noise Estimation

no code implementations · 10 May 2019 · Nico Gößling, Elior Hadad, Sharon Gannot, Simon Doclo

While the binaural minimum variance distortionless response (BMVDR) beamformer provides good noise reduction and preserves the binaural cues of the desired source, it does not allow the amount of reduction of the interfering sources to be controlled, and it distorts the binaural cues of both the interfering sources and the background noise.

Noise Estimation
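
For reference, the textbook LCMV solution underlying this family of beamformers is w = R^{-1} C (C^H R^{-1} C)^{-1} g. The sketch below merely evaluates that formula with NumPy; it is not the binaural algorithm proposed in the paper.

```python
# Textbook LCMV weights: R is the noise covariance, C stacks the constraint
# steering vectors, and g holds the desired responses for each constraint.
import numpy as np

def lcmv_weights(R, C, g):
    """R: (M, M) covariance; C: (M, K) constraints; g: (K,) desired responses."""
    Rinv_C = np.linalg.solve(R, C)                    # R^{-1} C
    middle = np.linalg.solve(C.conj().T @ Rinv_C, g)  # (C^H R^{-1} C)^{-1} g
    return Rinv_C @ middle                            # (M,) beamformer weights
```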

Semi-supervised source localization in reverberant environments with deep generative modeling

no code implementations · 26 Jan 2021 · Michael J. Bianco, Sharon Gannot, Efren Fernandez-Grande, Peter Gerstoft

As far as we are aware, our paper presents the first approach to modeling the physics of acoustic propagation using deep generative modeling.

Speech enhancement with mixture-of-deep-experts with clean clustering pre-training

no code implementations · 11 Feb 2021 · Shlomo E. Chazan, Jacob Goldberger, Sharon Gannot

The experts estimate a mask from the noisy input and the final mask is then obtained as a weighted average of the experts' estimates, with the weights determined by the gating DNN.

Clustering · Speech Enhancement
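
That combination step can be illustrated as follows (the expert and gating modules are assumed; this is not the authors' code):

```python
# Gated combination of expert masks: the gating DNN weights each expert's
# mask estimate, and the final mask is their weighted average.
import torch

def combine_masks(noisy_feats, experts, gating_dnn):
    masks = torch.stack([e(noisy_feats) for e in experts], dim=1)  # (batch, E, freq, time)
    weights = torch.softmax(gating_dnn(noisy_feats), dim=1)        # (batch, E) expert weights
    return (weights[:, :, None, None] * masks).sum(dim=1)          # (batch, freq, time) final mask
```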

Single microphone speaker extraction using unified time-frequency Siamese-Unet

no code implementations · 6 Mar 2022 · Aviad Eisenberg, Sharon Gannot, Shlomo E. Chazan

In this paper we present a unified time-frequency method for speaker extraction in clean and noisy conditions.

blind source separation
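
As background, time-frequency speaker extraction generally applies an estimated mask to the mixture's STFT and inverts the result. The sketch below shows only that generic step, not the Siamese-Unet itself.

```python
# Generic time-frequency masking: multiply the mixture STFT by an estimated
# mask (from some separate estimator) and transform back to the time domain.
import numpy as np
from scipy.signal import stft, istft

def extract_speaker(mixture, mask, fs=16000, nperseg=512):
    f, t, X = stft(mixture, fs=fs, nperseg=nperseg)       # mixture spectrogram
    assert mask.shape == X.shape                          # mask assumed given
    _, target = istft(mask * X, fs=fs, nperseg=nperseg)   # extracted speaker signal
    return target
```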

Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction

no code implementations · 1 Jan 2023 · Idan Cohen, Ofir Lindenbaum, Sharon Gannot

Classical methods for acoustic scene mapping require the estimation of time difference of arrival (TDOA) between microphones.

Dimensionality Reduction
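
The classical TDOA estimate referred to here is typically obtained with GCC-PHAT; a minimal version for two microphone signals is sketched below (the kind of baseline the unsupervised approach sidesteps).

```python
# Classical GCC-PHAT TDOA estimate between two microphone signals.
import numpy as np

def gcc_phat_tdoa(x1, x2, fs):
    n = len(x1) + len(x2)
    X1, X2 = np.fft.rfft(x1, n), np.fft.rfft(x2, n)
    cross = X1 * np.conj(X2)
    cc = np.fft.irfft(cross / (np.abs(cross) + 1e-12), n)  # phase-transform weighting
    lag = int(np.argmax(np.abs(cc)))
    if lag > n // 2:          # indices above n/2 correspond to negative lags
        lag -= n
    return lag / fs           # time difference of arrival in seconds
```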

Comparison of Frequency-Fusion Mechanisms for Binaural Direction-of-Arrival Estimation for Multiple Speakers

no code implementations · 15 Jan 2024 · Daniel Fejgin, Elior Hadad, Sharon Gannot, Zbyněk Koldovský, Simon Doclo

Depending on how the SPS are combined, frequency-fusion mechanisms are categorized as narrowband, broadband, or speaker-grouped, where the latter requires a speaker-wise grouping of frequencies.

Direction of Arrival Estimation
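
A toy illustration of the narrowband and broadband mechanisms is given below, assuming per-frequency spatial spectra `sps` of shape (n_freq, n_angles); the speaker-grouped variant additionally requires assigning frequencies to speakers and is omitted.

```python
# Two simple ways to fuse per-frequency spatial spectra into DOA estimates.
import numpy as np

def narrowband_doas(sps, angles):
    # One DOA estimate per frequency bin, each from its own spectrum.
    return angles[np.argmax(sps, axis=1)]

def broadband_doa(sps, angles):
    # Fuse the spectra across frequency first, then pick a single peak.
    return angles[np.argmax(sps.mean(axis=0))]
```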

Concurrent Speaker Detection: A multi-microphone Transformer-Based Approach

no code implementations · 11 Mar 2024 · Amit Eliav, Sharon Gannot

We present a deep-learning approach for the task of Concurrent Speaker Detection (CSD) using a modified transformer model.
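
A minimal sketch of such a model is shown below, assuming multichannel frame-level features and a small set of CSD classes (e.g. noise-only / single speaker / concurrent speakers); it is not the architecture from the paper.

```python
# Minimal transformer-encoder classifier for concurrent speaker detection.
import torch
import torch.nn as nn

class CSDModel(nn.Module):
    def __init__(self, feat_dim, n_classes=3, d_model=128):
        super().__init__()
        self.proj = nn.Linear(feat_dim, d_model)
        layer = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=2)
        self.head = nn.Linear(d_model, n_classes)

    def forward(self, feats):               # feats: (batch, time, feat_dim)
        h = self.encoder(self.proj(feats))  # contextualised frames
        return self.head(h.mean(dim=1))     # utterance-level class logits
```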
