Search Results for author: Catherine Lai

Found 15 papers, 1 paper with code

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition

no code implementations • 4 Feb 2024 • Alexandra Saliba, Yuanchao Li, Ramon Sanabria, Catherine Lai

Through a comparative experiment and a layer-wise accuracy analysis on two distinct corpora, IEMOCAP and ESD, we explore differences between AWEs and raw self-supervised representations, as well as the proper utilization of AWEs alone and in combination with word embeddings.

Speech Emotion Recognition • Word Embeddings
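
A layer-wise probing analysis of the kind this entry describes could look roughly like the sketch below. This is a minimal illustration, not the authors' code: it assumes a HuggingFace wav2vec 2.0 checkpoint, mean pooling in place of proper acoustic word embedding extraction, and a simple logistic-regression probe per layer.

# Sketch: probe each layer of a self-supervised speech model for
# emotion classification accuracy. Assumes transformers, torch, and
# scikit-learn; corpus loading (IEMOCAP/ESD) is left abstract.
import torch
from transformers import Wav2Vec2Model, Wav2Vec2FeatureExtractor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

model_name = "facebook/wav2vec2-base"  # stand-in checkpoint (assumption)
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name).eval()

def layer_embeddings(waveform, sr=16000):
    """Return one mean-pooled vector per hidden layer of the model."""
    inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # out.hidden_states: tuple of (1, frames, dim) tensors, one per layer
    return [h.squeeze(0).mean(dim=0).numpy() for h in out.hidden_states]

def layerwise_accuracy(waveforms, labels):
    """Cross-validated probe accuracy for every layer."""
    per_layer = list(zip(*[layer_embeddings(w) for w in waveforms]))
    scores = []
    for layer_idx, feats in enumerate(per_layer):
        clf = LogisticRegression(max_iter=1000)
        acc = cross_val_score(clf, list(feats), labels, cv=5).mean()
        scores.append((layer_idx, acc))
    return scores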

Quantifying the perceptual value of lexical and non-lexical channels in speech

no code implementations • 7 Jul 2023 • Sarenne Wallbridge, Peter Bell, Catherine Lai

Speech is a fundamental means of communication that can be seen to provide two channels for transmitting information: the lexical channel of what words are said, and the non-lexical channel of how they are spoken.

ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition

no code implementations • 25 May 2023 • Yuanchao Li, Zeyu Zhao, Ondrej Klejch, Peter Bell, Catherine Lai

To overcome this challenge, we investigate how Automatic Speech Recognition (ASR) performs on emotional speech by analyzing the ASR performance on emotion corpora and examining the distribution of word errors and confidence scores in ASR transcripts to gain insight into how emotion affects ASR.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2
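
A word-level error analysis along the lines of this entry might start from something like the sketch below, which assumes jiwer for WER computation and per-utterance emotion labels; the confidence-score analysis from the paper is not reproduced here, and the example data is purely hypothetical.

# Sketch: compare ASR word error rate across emotion categories.
# Assumes reference transcripts, ASR hypotheses, and an emotion label
# per utterance are already available.
from collections import defaultdict
import jiwer

def wer_by_emotion(references, hypotheses, emotions):
    """Group utterances by emotion label and report WER per group."""
    groups = defaultdict(lambda: ([], []))
    for ref, hyp, emo in zip(references, hypotheses, emotions):
        groups[emo][0].append(ref)
        groups[emo][1].append(hyp)
    return {emo: jiwer.wer(refs, hyps) for emo, (refs, hyps) in groups.items()}

# Toy usage with made-up utterances:
refs = ["i am so happy today", "leave me alone"]
hyps = ["i am so happy to day", "leave me a loan"]
emos = ["happy", "angry"]
print(wer_by_emotion(refs, hyps, emos))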

Transfer Learning for Personality Perception via Speech Emotion Recognition

no code implementations • 25 May 2023 • Yuanchao Li, Peter Bell, Catherine Lai

In this work, we investigate the relationship between two affective attributes: personality and emotion, from a transfer learning perspective.

Speech Emotion Recognition • Transfer Learning

Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora

no code implementations • 5 Oct 2022 • Yuanchao Li, Yumnah Mohamied, Peter Bell, Catherine Lai

Self-supervised speech models have grown fast during the past few years and have proven feasible for use in various downstream tasks.

Emotion Recognition

Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE

no code implementations • 28 Mar 2022 • Shuvayanti Das, Jennifer Williams, Catherine Lai

We found that speech synthesis quality degrades as the number of language switches within an utterance increases and the number of words decreases.

Speech Synthesis • Voice Conversion

A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals

no code implementations • 25 Mar 2022 • Yuanchao Li, Catherine Lai

In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset.

Knowledge Distillation • Spoken Dialogue Systems

Robotic Speech Synthesis: Perspectives on Interactions, Scenarios, and Ethics

no code implementations • 17 Mar 2022 • Yuanchao Li, Catherine Lai

In recent years, many works have investigated the feasibility of conversational robots for performing specific tasks, such as healthcare and interviewing.

Ethics • Speech Synthesis

Fusing ASR Outputs in Joint Training for Speech Emotion Recognition

no code implementations • 29 Oct 2021 • Yuanchao Li, Peter Bell, Catherine Lai

However, due to the scarcity of emotion labelled data and the difficulty of recognizing emotional speech, it is hard to obtain reliable linguistic features and models in this research area.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm

no code implementations • 6 Jul 2021 • Elijah Gutierrez, Pilar Oplustil-Gallegos, Catherine Lai

Text-to-Speech synthesis systems are generally evaluated using Mean Opinion Score (MOS) tests, where listeners score samples of synthetic speech on a Likert scale.

Speech Synthesis • Text-To-Speech Synthesis

It's not what you said, it's how you said it: discriminative perception of speech as a multichannel communication system

no code implementations • 1 May 2021 • Sarenne Wallbridge, Peter Bell, Catherine Lai

People convey information extremely effectively through spoken interaction using multiple channels of information transmission: the lexical channel of what is said, and the non-lexical channel of how it is said.

Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0

1 code implementation • 14 Mar 2020 • Zack Hodari, Catherine Lai, Simon King

In English, prosody adds a broad range of information to segment sequences, from information structure (e.g. contrast) to stylistic variation (e.g. expression of emotion).

Clustering • Representation Learning • +1
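
As a rough, generic illustration of an unsupervised discrete representation of F0 (not the specific model used in this paper), fixed-length F0 contour segments could be clustered with k-means, as in the sketch below. F0 extraction, segmentation, and resampling to a fixed length are assumed to have been done already.

# Sketch: naive unsupervised discretisation of F0 contours with k-means.
# Assumes numpy and scikit-learn; contours are fixed-length arrays
# (e.g. extracted with pyworld or librosa and resampled).
import numpy as np
from sklearn.cluster import KMeans

def discretise_f0(contours, n_codes=32, seed=0):
    """Map each fixed-length F0 contour to one of n_codes cluster ids."""
    X = np.stack(contours)                    # (n_segments, n_frames)
    X = X - X.mean(axis=1, keepdims=True)     # remove per-segment pitch level
    km = KMeans(n_clusters=n_codes, random_state=seed, n_init=10).fit(X)
    return km.labels_, km.cluster_centers_

# Toy usage: 100 random "contours" of 50 frames each (hypothetical data).
rng = np.random.default_rng(0)
labels, codebook = discretise_f0(rng.normal(size=(100, 50)), n_codes=8)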

Polarity and Intensity: the Two Aspects of Sentiment Analysis

no code implementations • WS 2018 • Leimin Tian, Catherine Lai, Johanna D. Moore

In particular, we build unimodal and multimodal multi-task learning models with sentiment score prediction as the main task and polarity and/or intensity classification as the auxiliary tasks.

General Classification • Multimodal Sentiment Analysis • +2
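
The multi-task setup this entry describes, with sentiment score regression as the main task and polarity/intensity classification as auxiliaries, could be sketched roughly as below. This is a generic PyTorch illustration with an assumed shared encoder, feature dimension, and loss weighting, not the authors' architecture.

# Sketch: multi-task head with sentiment score regression as the main
# task and polarity / intensity classification as auxiliary tasks.
import torch
import torch.nn as nn

class MultiTaskSentiment(nn.Module):
    def __init__(self, in_dim=128, hidden=64, n_polarity=3, n_intensity=3):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.score_head = nn.Linear(hidden, 1)                # main: sentiment score
        self.polarity_head = nn.Linear(hidden, n_polarity)    # aux: polarity
        self.intensity_head = nn.Linear(hidden, n_intensity)  # aux: intensity

    def forward(self, x):
        h = self.shared(x)
        return (self.score_head(h).squeeze(-1),
                self.polarity_head(h),
                self.intensity_head(h))

def multitask_loss(outputs, targets, aux_weight=0.3):
    """Weighted sum of the main regression loss and auxiliary classification losses."""
    score, polarity_logits, intensity_logits = outputs
    score_t, polarity_t, intensity_t = targets
    loss = nn.functional.mse_loss(score, score_t)
    loss = loss + aux_weight * nn.functional.cross_entropy(polarity_logits, polarity_t)
    loss = loss + aux_weight * nn.functional.cross_entropy(intensity_logits, intensity_t)
    return loss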
