no code implementations • 22 Aug 2021 • Krishna D N
We propose using large self-supervised pre-trained models for both the audio and text modalities, combined with cross-modality attention, for multimodal emotion recognition.
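As a rough illustration of what cross-modality attention means here, the sketch below lets each feature vector from one modality attend over the feature vectors of the other using plain scaled dot-product attention. It is not the authors' architecture: the function names, the toy 2-dimensional features, and the absence of learned projections or multiple heads are all assumptions made for brevity.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention (hypothetical helper).

    Each query vector (from one modality) is replaced by a weighted
    average of the value vectors (from the other modality).
    """
    dim = len(keys[0])
    attended = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim)
                  for k in keys]
        weights = softmax(scores)
        attended.append([sum(w * v[j] for w, v in zip(weights, values))
                         for j in range(len(values[0]))])
    return attended

# Toy stand-ins for pre-trained audio and text embeddings (assumed values).
audio_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
text_tokens = [[1.0, 0.0], [0.0, 1.0]]

# Audio attends over text, and text attends over audio.
audio_attending_text = cross_modal_attention(audio_frames, text_tokens, text_tokens)
text_attending_audio = cross_modal_attention(text_tokens, audio_frames, audio_frames)
```

In a real system the queries, keys, and values would come from learned linear projections of the pre-trained model outputs, and the two attended sequences would typically be pooled before classification.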
no code implementations • 22 Aug 2021 • Krishna D N
We use a phoneme decoder (PHN-DEC) for the phoneme recognition task and a grapheme decoder (GRP-DEC) to predict the grapheme sequence.
no code implementations • 22 Aug 2021 • Krishna D N
We jointly optimize the network for phoneme recognition, grapheme recognition, and language identification tasks with Joint CTC-Attention [2] training.
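Joint CTC-Attention training is commonly formulated as a weighted interpolation of a CTC loss and an attention-decoder loss. The sketch below shows that interpolation and one possible way to sum it across the three tasks named above. The snippet does not state the weights or how the language identification loss is combined, so `ctc_weight`, `task_weights`, and all loss values here are hypothetical placeholders.

```python
def joint_ctc_attention_loss(ctc_loss, attention_loss, ctc_weight=0.3):
    """Interpolate a CTC loss and an attention-decoder loss.

    ctc_weight is an assumed hyperparameter in [0, 1], not a value
    taken from the paper.
    """
    if not 0.0 <= ctc_weight <= 1.0:
        raise ValueError("ctc_weight must lie in [0, 1]")
    return ctc_weight * ctc_loss + (1.0 - ctc_weight) * attention_loss

def multitask_loss(task_losses, task_weights):
    """Weighted sum of per-task losses (assumed combination rule)."""
    return sum(task_weights[name] * loss for name, loss in task_losses.items())

# Placeholder per-batch loss values, for illustration only.
phoneme_loss = joint_ctc_attention_loss(ctc_loss=2.0, attention_loss=1.0, ctc_weight=0.5)
grapheme_loss = joint_ctc_attention_loss(ctc_loss=4.0, attention_loss=2.0, ctc_weight=0.5)
language_id_loss = 0.5  # e.g. a cross-entropy value

total_loss = multitask_loss(
    {"phoneme": phoneme_loss, "grapheme": grapheme_loss, "language_id": language_id_loss},
    {"phoneme": 1.0, "grapheme": 1.0, "language_id": 1.0},
)
```

Equal task weights are used only to keep the example simple; in practice they would be tuned on a development set.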
1 code implementation • 4 Nov 2020 • Krishna D N
We propose a multi-modal learning approach by utilising the phoneme information along with audio features for code-switch detection.
no code implementations • 5 Feb 2020 • Krishna D N, Ankita Patil, M. S. P Raj, Sai Prasad H S, Prabhu Aashish Garapati
The generated feature vector is shown to be highly language-discriminative and helps achieve state-of-the-art results on the language identification task.