Search Results for author: Karim Helwani

Found 8 papers, 0 papers with code

Sound Source Separation Using Latent Variational Block-Wise Disentanglement

no code implementations • 8 Feb 2024 • Karim Helwani, Masahito Togami, Paris Smaragdis, Michael M. Goodwin

In this paper, we present a hybrid classical digital signal processing/deep neural network (DSP/DNN) approach to source separation (SS), highlighting the theoretical link between variational autoencoders and classical approaches to SS.

Disentanglement

Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing

no code implementations • 10 Oct 2023 • Karim Helwani, Erfan Soltanmohammadi, Michael M. Goodwin

Improving the interpretability of deep neural networks has recently gained increased attention, especially when the power of deep learning is leveraged to solve problems in physics.

Acoustic Echo Cancellation • Audio Signal Processing

Learning Linear Groups in Neural Networks

no code implementations • 29 May 2023 • Emmanouil Theodosis, Karim Helwani, Demba Ba

Employing equivariance in neural networks leads to greater parameter efficiency and improved generalization performance through the encoding of domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the desired symmetries.
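Equivariance, as mentioned in the abstract, means a layer commutes with a group action on its input. A minimal sketch of this property, using a fixed cyclic-shift group and a circular convolution (the paper itself *learns* the group, which is not shown here):

```python
import numpy as np

def circular_conv(x, w):
    """1-D circular convolution: equivariant to cyclic shifts of x."""
    n = len(x)
    return np.array(
        [sum(w[k] * x[(i - k) % n] for k in range(len(w))) for i in range(n)]
    )

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # toy input signal
w = rng.standard_normal(3)   # toy filter

# Equivariance check: shifting the input shifts the output identically.
lhs = circular_conv(np.roll(x, 2), w)
rhs = np.roll(circular_conv(x, w), 2)
assert np.allclose(lhs, rhs)
```

Weight sharing across the group orbit is what yields the parameter efficiency the abstract refers to: one filter `w` serves all shifted copies of the input.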

Robust Audio Anomaly Detection

no code implementations • 3 Feb 2022 • Wo Jae Lee, Karim Helwani, Arvindh Krishnaswamy, Srikanth Tenneti

The presented approach does not assume the presence of labeled anomalies in the training dataset; it uses a novel deep neural network architecture to learn the temporal dynamics of the multivariate time series at multiple resolutions while remaining robust to contamination in the training data.

Anomaly Detection • Time Series • +1

Enhancing Audio Augmentation Methods with Consistency Learning

no code implementations • 9 Feb 2021 • Turab Iqbal, Karim Helwani, Arvindh Krishnaswamy, Wenwu Wang

For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss.

Audio Classification • Audio Tagging • +2
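The abstract notes that cross-entropy alone does not enforce invariance to augmentations. A common way to add such a constraint is a consistency term that penalizes divergence between predictions on clean and augmented views; the sketch below uses a KL-based penalty, which is a generic choice and not necessarily the exact loss used in the paper:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_clean, logits_aug):
    """Mean KL(p_clean || p_aug): zero only when predictions on the
    clean and augmented inputs agree, so minimizing it encourages
    invariance to the augmentation."""
    p = softmax(logits_clean)
    q = softmax(logits_aug)
    return np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))

# Usage idea (hypothetical training loop):
#   total_loss = cross_entropy(logits_clean, labels) \
#                + lam * consistency_loss(logits_clean, logits_aug)
```

The weight `lam` trading off classification accuracy against invariance is a hypothetical hyperparameter, not one quoted from the paper.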

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

no code implementations • 11 Aug 2020 • Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

Neural network applications generally benefit from larger models, but for current speech enhancement models, larger-scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data.

Speech Enhancement
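The title's frequency-positional embeddings give a convolutional network access to absolute frequency position, which plain convolutions discard. A minimal sketch of the general idea, appending a normalized frequency-index channel to a spectrogram (the paper's actual embedding design may differ):

```python
import numpy as np

def add_freq_positional_embedding(spec):
    """Stack a normalized frequency-index channel onto a (freq, time)
    spectrogram, so a CNN can condition on where in the spectrum a
    feature occurs. Returns an array of shape (2, freq, time)."""
    n_freq, n_time = spec.shape
    freq_pos = np.linspace(0.0, 1.0, n_freq)[:, None]           # (freq, 1)
    freq_channel = np.broadcast_to(freq_pos, (n_freq, n_time))  # tile over time
    return np.stack([spec, freq_channel], axis=0)

# Example: a 257-bin, 100-frame magnitude spectrogram becomes a
# 2-channel input for a 2-D convolutional front end.
out = add_freq_positional_embedding(np.zeros((257, 100)))
```

Shift-equivariant convolutions treat all frequency bins alike; the extra channel breaks that symmetry, which is useful in speech where low and high bins behave very differently.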
