Search Results for author: Ritwik Giri

Found 15 papers, 1 paper with code

A Framework for Unified Real-time Personalized and Non-Personalized Speech Enhancement

no code implementations • 23 Feb 2023 • Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M. Goodwin, Paris Smaragdis

In this study, we present an approach to train a single speech enhancement network that can perform both personalized and non-personalized speech enhancement.

Multi-Task Learning, Speech Enhancement

Personalized PercepNet: Real-time, Low-complexity Target Voice Separation and Enhancement

no code implementations • 8 Jun 2021 • Ritwik Giri, Shrikant Venkataramani, Jean-Marc Valin, Umut Isik, Arvindh Krishnaswamy

The presence of multiple talkers in the surrounding environment poses a difficult challenge for real-time speech communication systems considering the constraints on network size and complexity.

Semi-Supervised Singing Voice Separation with Noisy Self-Training

no code implementations • 16 Feb 2021 • Zhepei Wang, Ritwik Giri, Umut Isik, Jean-Marc Valin, Arvindh Krishnaswamy

Given a limited set of labeled data, we present a method to leverage a large volume of unlabeled data to improve the model's performance.

Data Augmentation

Enhancing into the codec: Noise Robust Speech Coding with Vector-Quantized Autoencoders

no code implementations • 12 Feb 2021 • Jonah Casebeer, Vinjai Vale, Umut Isik, Jean-Marc Valin, Ritwik Giri, Arvindh Krishnaswamy

Audio codecs based on discretized neural autoencoders have recently been developed and shown to provide significantly higher compression levels for comparable quality speech output.

PoCoNet: Better Speech Enhancement with Frequency-Positional Embeddings, Semi-Supervised Conversational Data, and Biased Loss

no code implementations • 11 Aug 2020 • Umut Isik, Ritwik Giri, Neerad Phansalkar, Jean-Marc Valin, Karim Helwani, Arvindh Krishnaswamy

Neural network applications generally benefit from larger-sized models, but for current speech enhancement models, larger scale networks often suffer from decreased robustness to the variety of real-world use cases beyond what is encountered in training data.

Speech Enhancement

Relevance Subject Machine: A Novel Person Re-identification Framework

no code implementations • 30 Mar 2017 • Igor Fedorov, Ritwik Giri, Bhaskar D. Rao, Truong Q. Nguyen

We propose a novel method called the Relevance Subject Machine (RSM) to solve the person re-identification (re-id) problem.

Person Re-Identification

Robust Bayesian Method for Simultaneous Block Sparse Signal Recovery with Applications to Face Recognition

no code implementations • 6 May 2016 • Igor Fedorov, Ritwik Giri, Bhaskar D. Rao, Truong Q. Nguyen

In this paper, we present a novel Bayesian approach to recover simultaneously block sparse signals in the presence of outliers.

Face Recognition

A Unified Framework for Sparse Non-Negative Least Squares using Multiplicative Updates and the Non-Negative Matrix Factorization Problem

no code implementations • 7 Apr 2016 • Igor Fedorov, Alican Nalci, Ritwik Giri, Bhaskar D. Rao, Truong Q. Nguyen, Harinath Garudadri

We show that the proposed framework encompasses a large class of S-NNLS algorithms and provide a computationally efficient inference procedure based on multiplicative update rules.

Type I and Type II Bayesian Methods for Sparse Signal Recovery using Scale Mixtures

no code implementations • 17 Jul 2015 • Ritwik Giri, Bhaskar D. Rao

In this paper, we propose a generalized scale mixture family of distributions, namely the Power Exponential Scale Mixture (PESM) family, to model the sparsity inducing priors currently in use for sparse signal recovery (SSR).

Vocal Bursts Type Prediction
