Search Results for author: Karim Helwani

Found 8 papers, 0 papers with code

Sound Source Separation Using Latent Variational Block-Wise Disentanglement

no code implementations • 8 Feb 2024 • Karim Helwani, Masahito Togami, Paris Smaragdis, Michael M. Goodwin

In this paper, we present a hybrid classical digital signal processing/deep neural network (DSP/DNN) approach to source separation (SS), highlighting the theoretical link between variational autoencoders and classical approaches to SS.

Disentanglement

Neural Harmonium: An Interpretable Deep Structure for Nonlinear Dynamic System Identification with Application to Audio Processing

no code implementations • 10 Oct 2023 • Karim Helwani, Erfan Soltanmohammadi, Michael M. Goodwin

Improving the interpretability of deep neural networks has recently gained increased attention, especially when the power of deep learning is leveraged to solve problems in physics.

Acoustic Echo Cancellation • Audio Signal Processing

Learning Linear Groups in Neural Networks

no code implementations • 29 May 2023 • Emmanouil Theodosis, Karim Helwani, Demba Ba

Employing equivariance in neural networks leads to greater parameter efficiency and improved generalization performance through the encoding of domain knowledge in the architecture; however, the majority of existing approaches require an a priori specification of the desired symmetries.
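Equivariance, as mentioned in the abstract, means a layer commutes with a group action on its input. A minimal sketch of this property, using a fixed cyclic-shift group and a circular convolution (the paper itself *learns* the group, which is not shown here):

```python
import numpy as np

def circular_conv(x, w):
    """1-D circular convolution: equivariant to cyclic shifts of x."""
    n = len(x)
    return np.array(
        [sum(w[k] * x[(i - k) % n] for k in range(len(w))) for i in range(n)]
    )

rng = np.random.default_rng(0)
x = rng.standard_normal(8)   # toy input signal
w = rng.standard_normal(3)   # toy filter

# Equivariance check: shifting the input shifts the output identically.
lhs = circular_conv(np.roll(x, 2), w)
rhs = np.roll(circular_conv(x, w), 2)
assert np.allclose(lhs, rhs)
```

Weight sharing across the group orbit is what yields the parameter efficiency the abstract refers to: one filter `w` serves all shifted copies of the input.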

Robust Audio Anomaly Detection

no code implementations • 3 Feb 2022 • Wo Jae Lee, Karim Helwani, Arvindh Krishnaswamy, Srikanth Tenneti

The presented approach does not assume the presence of labeled anomalies in the training dataset; it uses a novel deep neural network architecture to learn the temporal dynamics of the multivariate time series at multiple resolutions while remaining robust to contamination in the training data.

Anomaly Detection • Time Series • +1

Enhancing Audio Augmentation Methods with Consistency Learning

no code implementations • 9 Feb 2021 • Turab Iqbal, Karim Helwani, Arvindh Krishnaswamy, Wenwu Wang

For tasks such as classification, there is a good case for learning representations of the data that are invariant to such transformations, yet this is not explicitly enforced by classification losses such as the cross-entropy loss.

Audio Classification • Audio Tagging • +2
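The abstract notes that cross-entropy alone does not enforce invariance to augmentations. A common way to add such a constraint is a consistency term that penalizes divergence between predictions on clean and augmented views; the sketch below uses a KL-based penalty, which is a generic choice and not necessarily the exact loss used in the paper:

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def consistency_loss(logits_clean, logits_aug):
    """Mean KL(p_clean || p_aug): zero only when predictions on the
    clean and augmented inputs agree, so minimizing it encourages
    invariance to the augmentation."""
    p = softmax(logits_clean)
    q = softmax(logits_aug)
    return np.mean(np.sum(p * (np.log(p) - np.log(q)), axis=-1))

# Usage idea (hypothetical training loop):
#   total_loss = cross_entropy(logits_clean, labels) \
#                + lam * consistency_loss(logits_clean, logits_aug)
```

The weight `lam` trading off classification accuracy against invariance is a hypothetical hyperparameter, not one quoted from the paper.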

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

no code implementations • 11 Aug 2020 • Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

Neural network applications generally benefit from larger models, but for current speech enhancement models, larger-scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data.

Speech Enhancement
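The title's frequency-positional embeddings give a convolutional network access to absolute frequency position, which plain convolutions discard. A minimal sketch of the general idea, appending a normalized frequency-index channel to a spectrogram (the paper's actual embedding design may differ):

```python
import numpy as np

def add_freq_positional_embedding(spec):
    """Stack a normalized frequency-index channel onto a (freq, time)
    spectrogram, so a CNN can condition on where in the spectrum a
    feature occurs. Returns an array of shape (2, freq, time)."""
    n_freq, n_time = spec.shape
    freq_pos = np.linspace(0.0, 1.0, n_freq)[:, None]           # (freq, 1)
    freq_channel = np.broadcast_to(freq_pos, (n_freq, n_time))  # tile over time
    return np.stack([spec, freq_channel], axis=0)

# Example: a 257-bin, 100-frame magnitude spectrogram becomes a
# 2-channel input for a 2-D convolutional front end.
out = add_freq_positional_embedding(np.zeros((257, 100)))
```

Shift-equivariant convolutions treat all frequency bins alike; the extra channel breaks that symmetry, which is useful in speech where low and high bins behave very differently.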
