Search Results for author: Vidhyasaharan Sethu

Found 10 papers, 2 papers with code

Blind Estimation of Sub-band Acoustic Parameters from Ambisonics Recordings using Spectro-Spatial Covariance Features

no code implementations • 5 Nov 2024 • Hanyu Meng, Jeroen Breebaart, Jeremy Stoddard, Vidhyasaharan Sethu, Eliathamby Ambikairajah

Additionally, we introduce FOA-Conv3D, a novel back-end network for effectively utilising the SSCV feature with a 3D convolutional encoder.

AER-LLM: Ambiguity-aware Emotion Recognition Leveraging Large Language Models

no code implementations • 26 Sep 2024 • Xin Hong, Yuan Gong, Vidhyasaharan Sethu, Ting Dang

Recent advancements in Large Language Models (LLMs) have demonstrated great success in many Natural Language Processing (NLP) tasks.

Emotional Intelligence, Emotion Recognition, +1

A Joint Spectro-Temporal Relational Thinking Based Acoustic Modeling Framework

no code implementations • 17 Sep 2024 • Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed

Despite the crucial role relational thinking plays in human understanding of speech, it has yet to be leveraged in any artificial speech recognition systems.

speech-recognition, Speech Recognition

Dual-Constrained Dynamical Neural ODEs for Ambiguity-aware Continuous Emotion Prediction

1 code implementation • 31 Jul 2024 • Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah

There has been a significant focus on modelling emotion ambiguity in recent years, with advancements made in representing emotions as distributions to capture ambiguity.

Time Series

Binaural Selective Attention Model for Target Speaker Extraction

no code implementations • 18 Jun 2024 • Hanyu Meng, Qiquan Zhang, Xiangyu Zhang, Vidhyasaharan Sethu, Eliathamby Ambikairajah

The remarkable ability of humans to selectively focus on a target speaker in cocktail party scenarios is facilitated by binaural audio processing.

Target Speaker Extraction

Spatial HuBERT: Self-supervised Spatial Speech Representation Learning for a Single Talker from Multi-channel Audio

no code implementations • 17 Oct 2023 • Antoni Dimitriadis, Siqi Pan, Vidhyasaharan Sethu, Beena Ahmed

Spatial HuBERT learns representations that outperform state-of-the-art single-channel speech representations on a variety of spatial downstream tasks, particularly in reverberant and noisy environments.

Representation Learning, Self-Supervised Learning, +1

Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling

no code implementations • 21 Sep 2023 • Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed

Connectionist temporal classification (CTC) is commonly adopted for sequence modeling tasks like speech recognition, where it is necessary to preserve order between the input and target sequences.

Classification, speech-recognition, +1

A Novel Markovian Framework for Integrating Absolute and Relative Ordinal Emotion Information

no code implementations • 10 Aug 2021 • Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah

We propose a Markovian framework, referred to as the Dynamic Ordinal Markov Model (DOMM), that makes use of both absolute and relative ordinal information to improve speech-based ordinal emotion prediction.

The Ambiguous World of Emotion Representation

no code implementations • 1 Sep 2019 • Vidhyasaharan Sethu, Emily Mower Provost, Julien Epps, Carlos Busso, Nicholas Cummins, Shrikanth Narayanan

A key reason for this is the lack of a common mathematical framework to describe all the relevant elements of emotion representations.

Face Recognition, Speaker Verification, +2
