Search Results for author: Mandar Gogate

Found 16 papers, 1 paper with code

Audio-Visual Speech Enhancement and Separation by Leveraging Multi-Modal Self-Supervised Embeddings

no code implementations · 31 Oct 2022 · I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HUBERT can be generalized to audio-visual regression tasks.

Automatic Speech Recognition · Lip Reading +5

A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids

no code implementations · 24 Oct 2022 · Abhijeet Bishnu, Ankit Gupta, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain, Mathini Sellathurai, Tharmalingam Ratnarajah

In this paper, we design a first-of-its-kind transceiver (PHY layer) prototype for cloud-based audio-visual (AV) speech enhancement (SE), complying with the high data rate and low latency requirements of future multimodal hearing assistive technology.

Lip Reading · Speech Enhancement

A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning

no code implementations · 11 Feb 2022 · Tassadaq Hussain, Muhammad Diyan, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Yu Tsao, Amir Hussain

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals.
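The contrast the snippet draws — minimising feature distance versus the paper's correlation-based objective — can be illustrated with a minimal sketch. The function names and the Pearson-style criterion below are assumptions for illustration, not the paper's exact canonical-correlation formulation:

```python
import numpy as np

def mse_loss(clean, enhanced):
    """Conventional feature-distance objective: mean squared error."""
    return float(np.mean((clean - enhanced) ** 2))

def correlation_loss(clean, enhanced):
    """Correlation-style objective (illustrative): 1 - Pearson r between
    clean and enhanced feature trajectories, rewarding shape agreement
    rather than point-wise closeness."""
    c = clean - clean.mean()
    e = enhanced - enhanced.mean()
    r = float(np.sum(c * e) / (np.linalg.norm(c) * np.linalg.norm(e) + 1e-8))
    return 1.0 - r

clean = np.sin(np.linspace(0, 4 * np.pi, 200))
enhanced = 0.5 * clean + 0.01   # scaled copy: large MSE, near-perfect correlation

print(mse_loss(clean, enhanced))          # substantial distance
print(correlation_loss(clean, enhanced))  # near zero: trajectories agree
```

The scaled-copy example shows why the two criteria can disagree: a signal that tracks the clean trajectory but at the wrong gain is penalised heavily by MSE yet scores almost perfectly under a correlation criterion.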

Speech Enhancement

A Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning for Hearing-Assistive Technologies

no code implementations · 8 Feb 2022 · Tassadaq Hussain, Muhammad Diyan, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Yu Tsao, Amir Hussain

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are generally trained to minimise the distance between clean and enhanced speech features.

Speech Enhancement

A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement

no code implementations · 24 Jan 2022 · Tassadaq Hussain, Wei-Chien Wang, Mandar Gogate, Kia Dashtipour, Yu Tsao, Xugang Lu, Adeel Ahsan, Amir Hussain

To address this problem, we propose to integrate a novel temporal attentive-pooling (TAP) mechanism into a conventional convolutional recurrent neural network, termed TAP-CRNN.
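The core idea of attentive pooling over time can be sketched as follows. This is a minimal, assumed form (a single learned scoring vector with a softmax over frames), not the paper's TAP-CRNN architecture:

```python
import numpy as np

def temporal_attentive_pooling(features, w):
    """Collapse a (time, channels) feature map into one summary vector using
    attention over frames instead of plain average pooling.
    `w` is an assumed (channels,) scoring vector for illustration."""
    scores = features @ w                          # (time,) frame relevance scores
    scores = scores - scores.max()                 # numerical stability for softmax
    alpha = np.exp(scores) / np.exp(scores).sum()  # attention weights over time
    return alpha @ features                        # (channels,) attended summary

rng = np.random.default_rng(0)
feats = rng.standard_normal((50, 16))  # 50 time frames, 16 channels
w = rng.standard_normal(16)
pooled = temporal_attentive_pooling(feats, w)
print(pooled.shape)  # (16,)
```

Unlike mean pooling, frames the scoring vector deems relevant dominate the summary, which is the motivation for placing such a mechanism inside a CRNN enhancement model.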

Towards Robust Real-time Audio-Visual Speech Enhancement

no code implementations · 16 Dec 2021 · Mandar Gogate, Kia Dashtipour, Amir Hussain

The human brain contextually exploits heterogeneous sensory information to efficiently perform cognitive tasks including vision and hearing.

Speech Enhancement

Towards Intelligibility-Oriented Audio-Visual Speech Enhancement

1 code implementation · 18 Nov 2021 · Tassadaq Hussain, Mandar Gogate, Kia Dashtipour, Amir Hussain

To the best of our knowledge, this is the first work that exploits the integration of AV modalities with an I-O based loss function for SE.

Speech Enhancement

A Hybrid Persian Sentiment Analysis Framework: Integrating Dependency Grammar Based Rules and Deep Neural Networks

no code implementations · 30 Sep 2019 · Kia Dashtipour, Mandar Gogate, Jingpeng Li, Fengling Jiang, Bin Kong, Amir Hussain

When no pattern is triggered, the framework switches to its subsymbolic counterpart and leverages deep neural networks (DNN) to perform the classification.
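The rule-first, DNN-fallback dispatch the snippet describes can be sketched as below. The patterns and the stand-in classifier are hypothetical; the paper uses dependency-grammar rules and a trained DNN:

```python
def rule_based(tokens):
    """Symbolic pass: hand-written patterns (illustrative stand-ins for the
    dependency-grammar rules). Returns a label when a pattern fires, else None."""
    if "not" in tokens and "good" in tokens:
        return "negative"
    if "excellent" in tokens:
        return "positive"
    return None  # no pattern triggered

def dnn_fallback(tokens):
    """Subsymbolic pass: stand-in for the trained DNN classifier."""
    return "positive" if len(tokens) % 2 == 0 else "negative"

def classify(sentence):
    """Hybrid dispatch: try the symbolic rules first; when none trigger,
    switch to the subsymbolic counterpart."""
    tokens = sentence.lower().split()
    label = rule_based(tokens)
    return label if label is not None else dnn_fallback(tokens)

print(classify("not good at all"))   # a rule fires
print(classify("an ordinary film"))  # no rule -> DNN fallback
```

The design keeps interpretable rules in charge wherever they apply, reserving the learned model for inputs the rules cannot cover.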

Persian Sentiment Analysis

AV Speech Enhancement Challenge using a Real Noisy Corpus

no code implementations · 30 Sep 2019 · Mandar Gogate, Ahsan Adeel, Kia Dashtipour, Peter Derleth, Amir Hussain

This paper presents a first-of-its-kind audio-visual (AV) speech enhancement challenge in real noisy settings.

Speech Enhancement

CochleaNet: A Robust Language-independent Audio-Visual Model for Speech Enhancement

no code implementations · 23 Sep 2019 · Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain

In addition, our work challenges the popular belief that the scarcity of large-vocabulary multi-language AV corpora and of a wide variety of noises is a major bottleneck to building robust language-, speaker- and noise-independent SE systems.

Speech Enhancement

Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments

no code implementations · 28 Aug 2018 · Ahsan Adeel, Mandar Gogate, Amir Hussain

In this paper, we introduce a novel contextual AV switching component that contextually exploits AV cues with respect to different operating conditions to estimate clean audio, without requiring any SNR estimation.

Lip Reading · Speech Enhancement

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation

no code implementations · 31 Jul 2018 · Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain

The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises.
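Mask-based separation of the kind this paper's DNN estimates can be illustrated with an oracle ideal ratio mask (IRM) applied to a noisy magnitude spectrogram. This is a generic sketch of the masking framework, not the paper's AV mask-estimation network:

```python
import numpy as np

def ideal_ratio_mask(clean_mag, noise_mag):
    """Oracle IRM, commonly used as a training target for mask-estimation
    DNNs: clean magnitude over total magnitude per time-frequency bin."""
    return clean_mag / (clean_mag + noise_mag + 1e-8)

def apply_mask(noisy_mag, mask):
    """Apply an estimated time-frequency mask to the noisy magnitude
    spectrogram; in the paper the mask would come from the AV DNN."""
    return np.clip(mask, 0.0, 1.0) * noisy_mag

rng = np.random.default_rng(2)
clean = np.abs(rng.standard_normal((4, 6)))  # toy (freq, time) magnitudes
noise = np.abs(rng.standard_normal((4, 6)))
noisy = clean + noise

irm = ideal_ratio_mask(clean, noise)
est = apply_mask(noisy, irm)

err_before = float(np.mean((noisy - clean) ** 2))
err_after = float(np.mean((est - clean) ** 2))
print(err_after < err_before)  # True: masking moves the mixture toward clean speech
```

The learning problem is then to predict such a mask from the noisy audio (and, in the AV setting, the speaker's visual cues) without access to the clean reference.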

Speech Separation
