no code implementations • 23 Sep 2023 • Aashish Gottipati, Sami Khairy, Gabriel Mittag, Vishak Gopal, Ross Cutler
In this work, we tackle the problem of bandwidth estimation (BWE) for real-time communication systems; however, in contrast to previous works, we leverage the vast efforts of prior heuristic-based BWE methods and synergize these approaches with deep learning-based techniques.
no code implementations • 15 Sep 2023 • Ilya Gurvich, Ido Leichter, Dharmendar Reddy Palle, Yossi Asher, Alon Vinnikov, Igor Abramovski, Vishak Gopal, Ross Cutler, Eyal Krupka
We introduce a distinctive real-time, causal, neural network-based active speaker detection system optimized for low-power edge computing.
1 code implementation • 22 Mar 2023 • Gabriel Mittag, Babak Naderi, Vishak Gopal, Ross Cutler
Using these features together with VMAF core features, our proposed model achieves a PCC of 0.99 on the validation set.
1 code implementation • 27 Feb 2022 • Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner
We open-source datasets and test sets for researchers to train their deep noise suppression models, as well as a subjective evaluation framework based on ITU-T P.835 to rate and rank-order the challenge entries.
no code implementations • 8 Oct 2021 • Jerry Chee, Sebastian Braun, Vishak Gopal, Ross Cutler
We study the role of magnitude structured pruning as an architecture search to speed up the inference time of a deep noise suppression (DNS) model.
no code implementations • 5 Oct 2021 • Chandan K A Reddy, Vishak Gopal, Ross Cutler
In this work, we train an objective metric based on P.835 human ratings that outputs 3 scores: i) speech quality (SIG), ii) background noise quality (BAK), and iii) the overall quality (OVRL) of the audio.
2 code implementations • 6 Jan 2021 • Chandan K A Reddy, Harishchandra Dubey, Kazuhito Koishida, Arun Nair, Vishak Gopal, Ross Cutler, Sebastian Braun, Hannes Gamper, Robert Aichner, Sriram Srinivasan
In this version of the challenge organized at INTERSPEECH 2021, we are expanding both our training and test datasets to accommodate full band scenarios.
no code implementations • 23 Nov 2020 • Jayant Gupchup, Ashkan Aazami, Yaran Fan, Senja Filipi, Tom Finley, Scott Inglis, Marcus Asteborg, Luke Caroll, Rajan Chari, Markus Cozowicz, Vishak Gopal, Vinod Prakash, Sasikanth Bendapudi, Jack Gerrits, Eric Lau, Huazhou Liu, Marco Rossi, Dima Slobodianyk, Dmitri Birjukov, Matty Cooper, Nilesh Javar, Dmitriy Perednya, Sriram Srinivasan, John Langford, Ross Cutler, Johannes Gehrke
Large software systems tune hundreds of 'constants' to optimize their runtime performance.
no code implementations • 28 Oct 2020 • Chandan K A Reddy, Vishak Gopal, Ross Cutler
The no-reference approaches correlate poorly with human ratings and are not widely adopted in the research community.
1 code implementation • 23 Jun 2020 • Jamie Pool, Ebrahim Beyrami, Vishak Gopal, Ashkan Aazami, Jayant Gupchup, Jeff Rowland, Binlong Li, Pritesh Kanani, Ross Cutler, Johannes Gehrke
Web-scale applications can ship code on a daily to weekly cadence.
1 code implementation • 16 May 2020 • Chandan K. A. Reddy, Vishak Gopal, Ross Cutler, Ebrahim Beyrami, Roger Cheng, Harishchandra Dubey, Sergiy Matusevych, Robert Aichner, Ashkan Aazami, Sebastian Braun, Puneet Rana, Sriram Srinivasan, Johannes Gehrke
In this challenge, we open-sourced a large clean speech and noise corpus for training the noise suppression models and a representative test set to real-world scenarios consisting of both synthetic and real recordings.
1 code implementation • 23 Jan 2020 • Chandan K. A. Reddy, Ebrahim Beyrami, Harishchandra Dubey, Vishak Gopal, Roger Cheng, Ross Cutler, Sergiy Matusevych, Robert Aichner, Ashkan Aazami, Sebastian Braun, Puneet Rana, Sriram Srinivasan, Johannes Gehrke
In this challenge, we open-source a large clean speech and noise corpus for training the noise suppression models and a representative test set to real-world scenarios consisting of both synthetic and real recordings.