Search Results for author: Hakan Erdogan

Found 20 papers, 1 paper with code

Binaural Angular Separation Network

no code implementations 16 Jan 2024 Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann

We propose a neural network model that can separate target speech sources from interfering sources at different angular regions using two microphones.

Guided Speech Enhancement Network

no code implementations 13 Mar 2023 Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann

The multi-microphone speech enhancement problem is often decomposed into two decoupled steps: a beamformer that provides spatial filtering, and a single-channel speech enhancement model that cleans up the beamformer output.
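As a hedged illustration of this two-step decomposition (not the paper's actual model), the sketch below chains a delay-and-sum beamformer over two microphone channels with a crude single-channel spectral post-filter; all signal names and parameters are illustrative.

```python
import numpy as np

def delay_and_sum(mics, delays, sr):
    # mics: (n_mics, n_samples); delays: per-mic steering delay in seconds.
    # Step 1 (spatial filtering): phase-align each channel toward the
    # target direction in the frequency domain, then average.
    n_mics, n = mics.shape
    freqs = np.fft.rfftfreq(n, d=1.0 / sr)
    out = np.zeros(freqs.shape, dtype=complex)
    for m in range(n_mics):
        out += np.fft.rfft(mics[m]) * np.exp(2j * np.pi * freqs * delays[m])
    return np.fft.irfft(out / n_mics, n=n)

def spectral_gain(x):
    # Step 2 (single-channel clean-up): a toy post-filter that subtracts a
    # flat noise-floor estimate (the median bin magnitude) from the spectrum.
    X = np.fft.rfft(x)
    mag = np.abs(X)
    floor = np.median(mag)
    gain = np.maximum(mag - floor, 0.0) / (mag + 1e-12)
    return np.fft.irfft(gain * X, n=len(x))

# Toy demo: a 440 Hz target captured by two mics with independent noise.
np.random.seed(0)
sr = 16000
t = np.arange(sr) / sr
target = np.sin(2 * np.pi * 440 * t)
mics = np.stack([target, target]) + 0.3 * np.random.randn(2, sr)
enhanced = spectral_gain(delay_and_sum(mics, [0.0, 0.0], sr))
```

Averaging the two channels alone gains roughly 3 dB against uncorrelated noise; the post-filter then suppresses the remaining diffuse noise at the cost of some distortion, which mirrors why the decoupled pipeline is common.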

Denoising Speech Enhancement

CycleGAN-Based Unpaired Speech Dereverberation

no code implementations 29 Mar 2022 Hannah Muckenhirn, Aleksandr Safin, Hakan Erdogan, Felix de Chaumont Quitry, Marco Tagliasacchi, Scott Wisdom, John R. Hershey

Typically, neural network-based speech dereverberation models are trained on paired data, composed of a dry utterance and its corresponding reverberant utterance.

Speech Dereverberation

Sparse, Efficient, and Semantic Mixture Invariant Training: Taming In-the-Wild Unsupervised Sound Separation

no code implementations 1 Jun 2021 Scott Wisdom, Aren Jansen, Ron J. Weiss, Hakan Erdogan, John R. Hershey

The best performance is achieved using larger numbers of output sources, enabled by our efficient MixIT loss, combined with sparsity losses to prevent over-separation.

What's All the FUSS About Free Universal Sound Separation Data?

no code implementations 2 Nov 2020 Scott Wisdom, Hakan Erdogan, Daniel Ellis, Romain Serizel, Nicolas Turpault, Eduardo Fonseca, Justin Salamon, Prem Seetharaman, John Hershey

We introduce the Free Universal Sound Separation (FUSS) dataset, a new corpus for experiments in separating mixtures of an unknown number of sounds from an open domain of sound types.

Data Augmentation

Unsupervised Sound Separation Using Mixture Invariant Training

no code implementations NeurIPS 2020 Scott Wisdom, Efthymios Tzinis, Hakan Erdogan, Ron J. Weiss, Kevin Wilson, John R. Hershey

In such supervised approaches, a model is trained to predict the component sources from synthetic mixtures created by adding up isolated ground-truth sources.
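Mixture invariant training (MixIT) removes the need for isolated ground-truth sources: the model separates a mixture of mixtures, and the loss searches over all assignments of estimated sources back to the reference mixtures. A hedged numpy sketch of that assignment search, using a toy squared-error loss in place of the paper's negative-SNR loss (all names are illustrative):

```python
import itertools
import numpy as np

def mixit_loss(est_sources, ref_mixes):
    # est_sources: (M, T) separated outputs; ref_mixes: (2, T) reference mixtures.
    # Try every binary assignment of the M sources to the 2 mixtures and keep
    # the assignment whose remixed estimates best match the references.
    M = est_sources.shape[0]
    best = np.inf
    for assign in itertools.product([0, 1], repeat=M):
        A = np.zeros((2, M))
        A[list(assign), range(M)] = 1.0   # column m routes source m to mixture assign[m]
        remix = A @ est_sources
        best = min(best, np.mean((remix - ref_mixes) ** 2))
    return best

# Toy check: when the estimates are the true sources, some assignment is exact.
np.random.seed(0)
s = np.random.randn(4, 100)
mixes = np.stack([s[0] + s[2], s[1] + s[3]])
print(mixit_loss(s, mixes))   # effectively zero
```

Because the loss minimizes over all 2^M assignments, it is also invariant to the order in which the model emits its sources, which is why the follow-up work above focuses on making this search efficient for larger M.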

Speech Enhancement Speech Separation +1

Sequential Multi-Frame Neural Beamforming for Speech Separation and Enhancement

no code implementations 18 Nov 2019 Zhong-Qiu Wang, Hakan Erdogan, Scott Wisdom, Kevin Wilson, Desh Raj, Shinji Watanabe, Zhuo Chen, John R. Hershey

This work introduces sequential neural beamforming, which alternates between neural network based spectral separation and beamforming based spatial separation.

Speaker Separation Speech Enhancement +3

Low-Latency Speaker-Independent Continuous Speech Separation

no code implementations 13 Apr 2019 Takuya Yoshioka, Zhuo Chen, Changliang Liu, Xiong Xiao, Hakan Erdogan, Dimitrios Dimitriadis

Speaker-independent continuous speech separation (SI-CSS) is the task of converting a continuous audio stream, which may contain overlapping voices of unknown speakers, into a fixed number of continuous signals, each of which contains no overlapping speech segments.

Speech Recognition +1

SDR - half-baked or well done?

1 code implementation 6 Nov 2018 Jonathan Le Roux, Scott Wisdom, Hakan Erdogan, John R. Hershey

In speech enhancement and source separation, signal-to-noise ratio is a ubiquitous objective measure of denoising/separation quality.
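This paper advocates scale-invariant SDR (SI-SDR), which projects the estimate onto the reference so that a simple rescaling cannot inflate the score. A minimal numpy sketch of the standard SI-SDR formula (variable names are illustrative):

```python
import numpy as np

def si_sdr(estimate, reference):
    # Project the estimate onto the reference to factor out any gain
    # mismatch, then compare target energy against residual energy.
    alpha = np.dot(estimate, reference) / np.dot(reference, reference)
    target = alpha * reference        # part of the estimate explained by the reference
    residual = estimate - target      # everything else counts as error
    return 10.0 * np.log10(np.sum(target ** 2) / np.sum(residual ** 2))

np.random.seed(0)
ref = np.sin(np.linspace(0, 8 * np.pi, 1000))
est = ref + 0.1 * np.random.randn(1000)
print(si_sdr(est, ref))           # a noisy estimate, roughly 17 dB here
print(si_sdr(5.0 * est, ref))     # identical score: rescaling cannot game the metric
```

The scale invariance follows directly from the projection: scaling the estimate by any constant scales both the target and residual terms equally, leaving the ratio unchanged.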

Sound Audio and Speech Processing

Recognizing Overlapped Speech in Meetings: A Multichannel Separation Approach Using Neural Networks

no code implementations 8 Oct 2018 Takuya Yoshioka, Hakan Erdogan, Zhuo Chen, Xiong Xiao, Fil Alleva

The goal of this work is to develop a meeting transcription system that can recognize speech even when utterances of different speakers are overlapped.

Speech Recognition +1

Deep neural networks for single channel source separation

no code implementations 12 Nov 2013 Emad M. Grais, Mehmet Umut Sen, Hakan Erdogan

In the training stage, the training data for the source signals are used to train a DNN.

SUTAV: A Turkish Audio-Visual Database

no code implementations LREC 2012 Ibrahim Saygin Topkaya, Hakan Erdogan

The main aim of collecting the SUTAV database was to obtain a large audio-visual collection of spoken words, numbers, and sentences in the Turkish language.

Audio-Visual Speech Recognition Person Identification +2

Confidence-Based Dynamic Classifier Combination For Mean-Shift Tracking

no code implementations 29 Jul 2011 Ibrahim Saygin Topkaya, Hakan Erdogan

We use two different classifiers, one of which comes from a background modeling method, to generate the weight image; each classifier's contribution is calculated dynamically from its confidence to produce the final weight image used in tracking.
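A hedged sketch of the general idea (not the paper's exact formulation): each classifier produces a per-pixel foreground-likelihood image, and the two images are blended with weights proportional to each classifier's current confidence. All names and confidence values here are illustrative.

```python
import numpy as np

def combine_weight_images(w1, w2, conf1, conf2):
    # Blend two per-pixel weight images; each classifier contributes in
    # proportion to its confidence, so an unreliable classifier is damped.
    total = conf1 + conf2
    return (conf1 * w1 + conf2 * w2) / total

np.random.seed(0)
w_color = np.random.rand(48, 64)   # e.g. a color-histogram classifier's weight image
w_bg = np.random.rand(48, 64)      # e.g. a background-model classifier's weight image
final = combine_weight_images(w_color, w_bg, conf1=0.8, conf2=0.2)
```

The resulting `final` image plays the role of the weight map that mean-shift iterates over; recomputing the confidences every frame is what makes the combination dynamic.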

Object Tracking
