Search Results for author: Catherine Lai

Found 15 papers, 1 paper with code

Layer-Wise Analysis of Self-Supervised Acoustic Word Embeddings: A Study on Speech Emotion Recognition

no code implementations • 4 Feb 2024 • Alexandra Saliba, Yuanchao Li, Ramon Sanabria, Catherine Lai

Through a comparative experiment and a layer-wise accuracy analysis on two distinct corpora, IEMOCAP and ESD, we explore differences between AWEs and raw self-supervised representations, as well as the proper utilization of AWEs alone and in combination with word embeddings.

Speech Emotion Recognition • Word Embeddings
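
A layer-wise probing analysis of the kind this entry describes could look roughly like the sketch below. This is a minimal illustration, not the authors' code: it assumes a HuggingFace wav2vec 2.0 checkpoint, mean pooling in place of proper acoustic word embedding extraction, and a simple logistic-regression probe per layer.

# Sketch: probe each layer of a self-supervised speech model for
# emotion classification accuracy. Assumes transformers, torch, and
# scikit-learn; corpus loading (IEMOCAP/ESD) is left abstract.
import torch
from transformers import Wav2Vec2Model, Wav2Vec2FeatureExtractor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

model_name = "facebook/wav2vec2-base"  # stand-in checkpoint (assumption)
extractor = Wav2Vec2FeatureExtractor.from_pretrained(model_name)
model = Wav2Vec2Model.from_pretrained(model_name).eval()

def layer_embeddings(waveform, sr=16000):
    """Return one mean-pooled vector per hidden layer of the model."""
    inputs = extractor(waveform, sampling_rate=sr, return_tensors="pt")
    with torch.no_grad():
        out = model(**inputs, output_hidden_states=True)
    # out.hidden_states: tuple of (1, frames, dim) tensors, one per layer
    return [h.squeeze(0).mean(dim=0).numpy() for h in out.hidden_states]

def layerwise_accuracy(waveforms, labels):
    """Cross-validated probe accuracy for every layer."""
    per_layer = list(zip(*[layer_embeddings(w) for w in waveforms]))
    scores = []
    for layer_idx, feats in enumerate(per_layer):
        clf = LogisticRegression(max_iter=1000)
        acc = cross_val_score(clf, list(feats), labels, cv=5).mean()
        scores.append((layer_idx, acc))
    return scores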

Quantifying the perceptual value of lexical and non-lexical channels in speech

no code implementations • 7 Jul 2023 • Sarenne Wallbridge, Peter Bell, Catherine Lai

Speech is a fundamental means of communication that can be seen to provide two channels for transmitting information: the lexical channel of what words are said, and the non-lexical channel of how they are spoken.

ASR and Emotional Speech: A Word-Level Investigation of the Mutual Impact of Speech and Emotion Recognition

no code implementations • 25 May 2023 • Yuanchao Li, Zeyu Zhao, Ondrej Klejch, Peter Bell, Catherine Lai

To overcome this challenge, we investigate how Automatic Speech Recognition (ASR) performs on emotional speech by analyzing the ASR performance on emotion corpora and examining the distribution of word errors and confidence scores in ASR transcripts to gain insight into how emotion affects ASR.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2
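
A word-level error analysis along the lines of this entry might start from something like the sketch below, which assumes jiwer for WER computation and per-utterance emotion labels; the confidence-score analysis from the paper is not reproduced here, and the example data is purely hypothetical.

# Sketch: compare ASR word error rate across emotion categories.
# Assumes reference transcripts, ASR hypotheses, and an emotion label
# per utterance are already available.
from collections import defaultdict
import jiwer

def wer_by_emotion(references, hypotheses, emotions):
    """Group utterances by emotion label and report WER per group."""
    groups = defaultdict(lambda: ([], []))
    for ref, hyp, emo in zip(references, hypotheses, emotions):
        groups[emo][0].append(ref)
        groups[emo][1].append(hyp)
    return {emo: jiwer.wer(refs, hyps) for emo, (refs, hyps) in groups.items()}

# Toy usage with made-up utterances:
refs = ["i am so happy today", "leave me alone"]
hyps = ["i am so happy to day", "leave me a loan"]
emos = ["happy", "angry"]
print(wer_by_emotion(refs, hyps, emos))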

Transfer Learning for Personality Perception via Speech Emotion Recognition

no code implementations • 25 May 2023 • Yuanchao Li, Peter Bell, Catherine Lai

In this work, we investigate the relationship between two affective attributes: personality and emotion, from a transfer learning perspective.

Speech Emotion Recognition • Transfer Learning

Exploration of A Self-Supervised Speech Model: A Study on Emotional Corpora

no code implementations • 5 Oct 2022 • Yuanchao Li, Yumnah Mohamied, Peter Bell, Catherine Lai

Self-supervised speech models have grown fast during the past few years and have proven feasible for use in various downstream tasks.

Emotion Recognition

Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE

no code implementations • 28 Mar 2022 • Shuvayanti Das, Jennifer Williams, Catherine Lai

We found that speech synthesis quality degrades as the number of language switches within an utterance increases and the number of words decreases.

Speech Synthesis • Voice Conversion

A Cross-Domain Approach for Continuous Impression Recognition from Dyadic Audio-Visual-Physio Signals

no code implementations • 25 Mar 2022 • Yuanchao Li, Catherine Lai

In this paper, we perform impression recognition using a proposed cross-domain architecture on the dyadic IMPRESSION dataset.

Knowledge Distillation • Spoken Dialogue Systems

Robotic Speech Synthesis: Perspectives on Interactions, Scenarios, and Ethics

no code implementations • 17 Mar 2022 • Yuanchao Li, Catherine Lai

In recent years, many works have investigated the feasibility of conversational robots for performing specific tasks, such as healthcare and interviewing.

Ethics • Speech Synthesis

Fusing ASR Outputs in Joint Training for Speech Emotion Recognition

no code implementations • 29 Oct 2021 • Yuanchao Li, Peter Bell, Catherine Lai

However, due to the scarcity of emotion labelled data and the difficulty of recognizing emotional speech, it is hard to obtain reliable linguistic features and models in this research area.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) • +2

Location, Location: Enhancing the Evaluation of Text-to-Speech Synthesis Using the Rapid Prosody Transcription Paradigm

no code implementations • 6 Jul 2021 • Elijah Gutierrez, Pilar Oplustil-Gallegos, Catherine Lai

Text-to-Speech synthesis systems are generally evaluated using Mean Opinion Score (MOS) tests, where listeners score samples of synthetic speech on a Likert scale.

Speech Synthesis • Text-To-Speech Synthesis

It's not what you said, it's how you said it: discriminative perception of speech as a multichannel communication system

no code implementations • 1 May 2021 • Sarenne Wallbridge, Peter Bell, Catherine Lai

People convey information extremely effectively through spoken interaction using multiple channels of information transmission: the lexical channel of what is said, and the non-lexical channel of how it is said.

Perception of prosodic variation for speech synthesis using an unsupervised discrete representation of F0

1 code implementation • 14 Mar 2020 • Zack Hodari, Catherine Lai, Simon King

In English, prosody adds a broad range of information to segment sequences, from information structure (e.g. contrast) to stylistic variation (e.g. expression of emotion).

Clustering • Representation Learning • +1
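
As a rough, generic illustration of an unsupervised discrete representation of F0 (not the specific model used in this paper), fixed-length F0 contour segments could be clustered with k-means, as in the sketch below. F0 extraction, segmentation, and resampling to a fixed length are assumed to have been done already.

# Sketch: naive unsupervised discretisation of F0 contours with k-means.
# Assumes numpy and scikit-learn; contours are fixed-length arrays
# (e.g. extracted with pyworld or librosa and resampled).
import numpy as np
from sklearn.cluster import KMeans

def discretise_f0(contours, n_codes=32, seed=0):
    """Map each fixed-length F0 contour to one of n_codes cluster ids."""
    X = np.stack(contours)                    # (n_segments, n_frames)
    X = X - X.mean(axis=1, keepdims=True)     # remove per-segment pitch level
    km = KMeans(n_clusters=n_codes, random_state=seed, n_init=10).fit(X)
    return km.labels_, km.cluster_centers_

# Toy usage: 100 random "contours" of 50 frames each (hypothetical data).
rng = np.random.default_rng(0)
labels, codebook = discretise_f0(rng.normal(size=(100, 50)), n_codes=8)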

Polarity and Intensity: the Two Aspects of Sentiment Analysis

no code implementations • WS 2018 • Leimin Tian, Catherine Lai, Johanna D. Moore

In particular, we build unimodal and multimodal multi-task learning models with sentiment score prediction as the main task and polarity and/or intensity classification as the auxiliary tasks.

General Classification • Multimodal Sentiment Analysis • +2
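
The multi-task setup this entry describes, with sentiment score regression as the main task and polarity/intensity classification as auxiliaries, could be sketched roughly as below. This is a generic PyTorch illustration with an assumed shared encoder, feature dimension, and loss weighting, not the authors' architecture.

# Sketch: multi-task head with sentiment score regression as the main
# task and polarity / intensity classification as auxiliary tasks.
import torch
import torch.nn as nn

class MultiTaskSentiment(nn.Module):
    def __init__(self, in_dim=128, hidden=64, n_polarity=3, n_intensity=3):
        super().__init__()
        self.shared = nn.Sequential(nn.Linear(in_dim, hidden), nn.ReLU())
        self.score_head = nn.Linear(hidden, 1)                # main: sentiment score
        self.polarity_head = nn.Linear(hidden, n_polarity)    # aux: polarity
        self.intensity_head = nn.Linear(hidden, n_intensity)  # aux: intensity

    def forward(self, x):
        h = self.shared(x)
        return (self.score_head(h).squeeze(-1),
                self.polarity_head(h),
                self.intensity_head(h))

def multitask_loss(outputs, targets, aux_weight=0.3):
    """Weighted sum of the main regression loss and auxiliary classification losses."""
    score, polarity_logits, intensity_logits = outputs
    score_t, polarity_t, intensity_t = targets
    loss = nn.functional.mse_loss(score, score_t)
    loss = loss + aux_weight * nn.functional.cross_entropy(polarity_logits, polarity_t)
    loss = loss + aux_weight * nn.functional.cross_entropy(intensity_logits, intensity_t)
    return loss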
