Search Results for author: Mandar Gogate

Found 17 papers, 1 paper with code

Audio-Visual Speech Enhancement Using Self-supervised Learning to Improve Speech Intelligibility in Cochlear Implant Simulations

no code implementations • 15 Jul 2023 • Richard Lee Lai, Jen-Cheng Hou, Mandar Gogate, Kia Dashtipour, Amir Hussain, Yu Tsao

The aim of this study is to explore the effectiveness of audio-visual speech enhancement (AVSE) in enhancing the intelligibility of vocoded speech in cochlear implant (CI) simulations.

Self-Supervised Learning Speech Enhancement

Audio-Visual Speech Enhancement and Separation by Utilizing Multi-Modal Self-Supervised Embeddings

no code implementations • 31 Oct 2022 • I-Chun Chern, Kuo-Hsuan Hung, Yi-Ting Chen, Tassadaq Hussain, Mandar Gogate, Amir Hussain, Yu Tsao, Jen-Cheng Hou

In summary, our results confirm the effectiveness of our proposed model for the AVSS task with proper fine-tuning strategies, demonstrating that multi-modal self-supervised embeddings obtained from AV-HuBERT can be generalized to audio-visual regression tasks.

Automatic Speech Recognition (ASR) +6

A Novel Frame Structure for Cloud-Based Audio-Visual Speech Enhancement in Multimodal Hearing-aids

no code implementations • 24 Oct 2022 • Abhijeet Bishnu, Ankit Gupta, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain, Mathini Sellathurai, Tharmalingam Ratnarajah

In this paper, we design a first-of-its-kind transceiver (PHY layer) prototype for cloud-based audio-visual (AV) speech enhancement (SE) that complies with the high data rate and low latency requirements of future multimodal hearing assistive technology.

Lip Reading Speech Enhancement

A Novel Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning

no code implementations • 11 Feb 2022 • Tassadaq Hussain, Muhammad Diyan, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Yu Tsao, Amir Hussain

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are often trained to minimise the feature distance between noise-free speech and enhanced speech signals.

Speech Enhancement

A Speech Intelligibility Enhancement Model based on Canonical Correlation and Deep Learning for Hearing-Assistive Technologies

no code implementations • 8 Feb 2022 • Tassadaq Hussain, Muhammad Diyan, Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Yu Tsao, Amir Hussain

Current deep learning (DL) based approaches to speech intelligibility enhancement in noisy environments are generally trained to minimise the distance between clean and enhanced speech features.

Speech Enhancement

A Novel Temporal Attentive-Pooling based Convolutional Recurrent Architecture for Acoustic Signal Enhancement

no code implementations • 24 Jan 2022 • Tassadaq Hussain, Wei-Chien Wang, Mandar Gogate, Kia Dashtipour, Yu Tsao, Xugang Lu, Adeel Ahsan, Amir Hussain

To address this problem, we propose to integrate a novel temporal attentive-pooling (TAP) mechanism into a conventional convolutional recurrent neural network, termed TAP-CRNN.

Towards Robust Real-time Audio-Visual Speech Enhancement

no code implementations • 16 Dec 2021 • Mandar Gogate, Kia Dashtipour, Amir Hussain

The human brain contextually exploits heterogeneous sensory information to efficiently perform cognitive tasks including vision and hearing.

Speech Enhancement

Towards Intelligibility-Oriented Audio-Visual Speech Enhancement

1 code implementation • 18 Nov 2021 • Tassadaq Hussain, Mandar Gogate, Kia Dashtipour, Amir Hussain

To the best of our knowledge, this is the first work that exploits the integration of AV modalities with an I-O based loss function for SE.

Speech Enhancement

A Novel Context-Aware Multimodal Framework for Persian Sentiment Analysis

no code implementations • 3 Mar 2021 • Kia Dashtipour, Mandar Gogate, Erik Cambria, Amir Hussain

Most recent works on sentiment analysis have exploited the text modality.

Multimodal Sentiment Analysis Persian Sentiment Analysis

An Experimental Analysis of Attack Classification Using Machine Learning in IoT Networks

no code implementations • 10 Jan 2021 • Andrew Churcher, Rehmat Ullah, Jawad Ahmad, Sadaqat ur Rehman, Fawad Masood, Mandar Gogate, Fehaid Alqahtani, Boubakr Nour, William J. Buchanan

Based on several parameters such as accuracy, precision, recall, F1 score, and log loss, we experimentally compared the aforementioned ML algorithms.

BIG-bench Machine Learning Binary Classification +3

AV Speech Enhancement Challenge using a Real Noisy Corpus

no code implementations • 30 Sep 2019 • Mandar Gogate, Ahsan Adeel, Kia Dashtipour, Peter Derleth, Amir Hussain

This paper presents a first-of-its-kind audio-visual (AV) speech enhancement challenge in real noisy settings.

Speech Enhancement

A Hybrid Persian Sentiment Analysis Framework: Integrating Dependency Grammar Based Rules and Deep Neural Networks

no code implementations • 30 Sep 2019 • Kia Dashtipour, Mandar Gogate, Jingpeng Li, Fengling Jiang, Bin Kong, Amir Hussain

When no pattern is triggered, the framework switches to its subsymbolic counterpart and leverages deep neural networks (DNN) to perform the classification.

Persian Sentiment Analysis

CochleaNet: A Robust Language-independent Audio-Visual Model for Speech Enhancement

no code implementations • 23 Sep 2019 • Mandar Gogate, Kia Dashtipour, Ahsan Adeel, Amir Hussain

In addition, our work challenges the popular belief that the scarcity of multi-language, large-vocabulary AV corpora and of a wide variety of noises is a major bottleneck to building robust language-, speaker- and noise-independent SE systems.

Speech Enhancement

Contextual Audio-Visual Switching For Speech Enhancement in Real-World Environments

no code implementations • 28 Aug 2018 • Ahsan Adeel, Mandar Gogate, Amir Hussain

In this paper, we introduce a novel contextual AV switching component that contextually exploits AV cues with respect to different operating conditions to estimate clean audio, without requiring any SNR estimation.

Lip Reading Speech Enhancement

Exploiting Deep Learning for Persian Sentiment Analysis

no code implementations • 15 Aug 2018 • Kia Dashtipour, Mandar Gogate, Ahsan Adeel, Cosimo Ieracitano, Hadi Larijani, Amir Hussain

The rise of social media is enabling people to freely express their opinions about products and services.

BIG-bench Machine Learning Persian Sentiment Analysis

Lip-Reading Driven Deep Learning Approach for Speech Enhancement

no code implementations • 31 Jul 2018 • Ahsan Adeel, Mandar Gogate, Amir Hussain, William M. Whitmer

The proposed audio-visual (AV) speech enhancement framework operates at two levels.

Acoustic Modelling Lip Reading +2

DNN driven Speaker Independent Audio-Visual Mask Estimation for Speech Separation

no code implementations • 31 Jul 2018 • Mandar Gogate, Ahsan Adeel, Ricard Marxer, Jon Barker, Amir Hussain

The process of selective attention in the brain is known to contextually exploit the available audio and visual cues to better focus on the target speaker while filtering out other noises.

Speech Separation
