Search Results for author: Beena Ahmed

Found 6 papers, 1 paper with code

When LLMs Meets Acoustic Landmarks: An Efficient Approach to Integrate Speech into Large Language Models for Depression Detection

no code implementations • 17 Feb 2024 • Xiangyu Zhang, Hexin Liu, Kaishuai Xu, Qiquan Zhang, Daijiao Liu, Beena Ahmed, Julien Epps

In addition, this approach is not only valuable for the detection of depression but also represents a new perspective in enhancing the ability of LLMs to comprehend and process speech signals.

Depression Detection

Phonological Level wav2vec2-based Mispronunciation Detection and Diagnosis Method

no code implementations • 13 Nov 2023 • Mostafa Shahin, Julien Epps, Beena Ahmed

We further propose a multi-label variant of the Connectionist Temporal Classification (CTC) approach to jointly model the non-mutually exclusive speech attributes using a single model.

Attribute
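
The excerpt above mentions a multi-label variant of CTC for non-mutually-exclusive speech attributes but does not spell it out. The sketch below is one plausible reading, assuming a shared acoustic encoder with an independent CTC head per attribute group (e.g. voicing, place, manner) and a summed loss; the class names, group sizes, and architecture are illustrative assumptions, not taken from the paper.

```python
import torch
import torch.nn as nn

class MultiLabelCTC(nn.Module):
    """Shared encoder with one CTC head per phonological attribute group.

    Each group gets its own small vocabulary plus a blank symbol and its own
    CTC loss; summing the losses lets a single model jointly predict several
    non-mutually-exclusive attribute streams (illustrative sketch only).
    """

    def __init__(self, feat_dim=768, hidden=256, group_vocab_sizes=(3, 8, 7)):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=2,
                               batch_first=True, bidirectional=True)
        # One per-frame projection per attribute group; index 0 is the blank.
        self.heads = nn.ModuleList(
            nn.Linear(2 * hidden, v + 1) for v in group_vocab_sizes
        )
        self.ctc = nn.CTCLoss(blank=0, zero_infinity=True)

    def forward(self, feats, feat_lens, targets, target_lens):
        """targets / target_lens are lists with one entry per attribute group."""
        enc, _ = self.encoder(feats)                       # (B, T, 2*hidden)
        total_loss = 0.0
        for head, tgt, tgt_len in zip(self.heads, targets, target_lens):
            log_probs = head(enc).log_softmax(-1)          # (B, T, V+1)
            # nn.CTCLoss expects (T, B, V+1)
            total_loss = total_loss + self.ctc(
                log_probs.transpose(0, 1), tgt, feat_lens, tgt_len)
        return total_loss
```

A forward pass with frame-level features of shape (batch, frames, feat_dim) and one integer target sequence per attribute group returns the joint training loss.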

Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio

no code implementations • 17 Oct 2023 • Antoni Dimitriadis, Siqi Pan, Vidhyasaharan Sethu, Beena Ahmed

Spatial HuBERT learns representations that outperform state-of-the-art single-channel speech representations on a variety of spatial downstream tasks, particularly in reverberant and noisy environments.

Representation Learning • Self-Supervised Learning

Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling

no code implementations • 21 Sep 2023 • Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed

Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences.

Classification • Speech Recognition +1
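
For reference, the excerpt above describes standard CTC, which marginalises over all monotonic (order-preserving) alignments between frame-level outputs and a shorter target sequence. The snippet below is a minimal, self-contained example of `torch.nn.CTCLoss` with random tensors; it illustrates the baseline the paper builds on, not the paper's variational formulation.

```python
import torch
import torch.nn as nn

batch, frames, vocab = 4, 50, 28          # vocab includes the blank at index 0
target_len = 12

# Frame-level log-probabilities from an acoustic model: (T, B, V) for CTCLoss.
log_probs = torch.randn(frames, batch, vocab).log_softmax(-1)

# Targets are shorter than the input and contain no blanks; CTC sums over
# every monotonic alignment, so label order is preserved without needing
# frame-level alignments.
targets = torch.randint(1, vocab, (batch, target_len))
input_lengths = torch.full((batch,), frames, dtype=torch.long)
target_lengths = torch.full((batch,), target_len, dtype=torch.long)

loss = nn.CTCLoss(blank=0)(log_probs, targets, input_lengths, target_lengths)
print(loss.item())
```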

Improving Children's Speech Recognition by Fine-tuning Self-supervised Adult Speech Representations

1 code implementation • 14 Nov 2022 • Renee Lu, Mostafa Shahin, Beena Ahmed

We assess the performance of fine-tuning on both native and non-native children's speech, examine the effect of cross-domain child corpora, and investigate the minimum amount of child speech required to fine-tune a model which outperforms a state-of-the-art adult model.

Self-Supervised Learning • Speech Recognition +1
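
The paper above fine-tunes self-supervised adult speech representations on child speech. Below is a minimal sketch of that general recipe using a Hugging Face `transformers` wav2vec 2.0 checkpoint; the checkpoint name, freezing choice, and toy training step are assumptions for illustration and do not reproduce the paper's exact setup or corpora (in older `transformers` versions the freezing helper is `freeze_feature_extractor`).

```python
import torch
from transformers import Wav2Vec2ForCTC, Wav2Vec2Processor

# Adult-speech self-supervised model with a CTC head for transcription.
processor = Wav2Vec2Processor.from_pretrained("facebook/wav2vec2-base-960h")
model = Wav2Vec2ForCTC.from_pretrained("facebook/wav2vec2-base-960h")

# Common practice: keep the convolutional feature encoder frozen and only
# adapt the transformer layers and CTC head to the child-speech domain.
model.freeze_feature_encoder()
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

def training_step(waveform_16k, transcript):
    """One fine-tuning step on a single (audio, text) pair (toy example)."""
    inputs = processor(waveform_16k, sampling_rate=16000, return_tensors="pt")
    labels = processor.tokenizer(transcript, return_tensors="pt").input_ids
    out = model(inputs.input_values, labels=labels)   # CTC loss is built in
    out.loss.backward()
    optimizer.step()
    optimizer.zero_grad()
    return out.loss.item()
```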

Speaker- and Age-Invariant Training for Child Acoustic Modeling Using Adversarial Multi-Task Learning

no code implementations • 19 Oct 2022 • Mostafa Shahin, Beena Ahmed, Julien Epps

These high acoustic variations along with the scarcity of child speech corpora have impeded the development of a reliable speech recognition system for children.

Acoustic Modelling • Multi-Task Learning +2
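
Speaker- and age-invariant adversarial multi-task training, as named in the title above, is commonly implemented with a gradient reversal layer. The sketch below is an illustrative reading of that idea, not the paper's exact architecture: the encoder is trained on the main phone targets while reversed gradients from auxiliary speaker and age classifiers push it toward representations those classifiers cannot exploit. All layer sizes and head names are assumptions.

```python
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; scales gradients by -lambda backward."""
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None

class AdversarialAcousticModel(nn.Module):
    def __init__(self, feat_dim=40, hidden=512, n_phones=42,
                 n_speakers=100, n_age_groups=4, lam=0.5):
        super().__init__()
        self.lam = lam
        self.encoder = nn.Sequential(nn.Linear(feat_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, hidden), nn.ReLU())
        self.phone_head = nn.Linear(hidden, n_phones)      # main ASR task
        self.speaker_head = nn.Linear(hidden, n_speakers)  # adversarial task
        self.age_head = nn.Linear(hidden, n_age_groups)    # adversarial task

    def forward(self, feats):
        h = self.encoder(feats)
        rev = GradReverse.apply(h, self.lam)
        # Minimising all three cross-entropies trains the heads normally,
        # while the reversed gradient makes the encoder degrade the speaker
        # and age predictions, encouraging speaker-/age-invariant features.
        return self.phone_head(h), self.speaker_head(rev), self.age_head(rev)
```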
