Search Results for author: Panayiotis Georgiou

Found 29 papers, 5 papers with code

Unsupervised Latent Behavior Manifold Learning from Acoustic Features: audio2behavior

no code implementations • 12 Jan 2017 • Haoqi Li, Brian Baucom, Panayiotis Georgiou

Behavioral annotation using signal processing and machine learning is highly dependent on training data and manual annotations of behavioral labels.

Learning from Past Mistakes: Improving Automatic Speech Recognition Output via Noisy-Clean Phrase Context Modeling

1 code implementation • 7 Feb 2018 • Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou

In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert those.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2
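
A minimal sketch of the general noisy-channel idea behind this entry: learn phrase-level corrections from paired (noisy ASR, clean reference) transcripts and apply the most frequent correction at decode time. The toy data, 1:1 alignment assumption, and function names are illustrative only, not the paper's phrase-based system.

```python
# Toy phrase-level ASR error correction: count noisy->clean phrase substitutions observed in
# aligned training pairs, then greedily apply the most frequent correction to new hypotheses.
from collections import Counter, defaultdict

def learn_phrase_corrections(pairs, max_len=3):
    """Count how often each noisy n-gram is replaced by a clean n-gram of the same span."""
    table = defaultdict(Counter)
    for noisy, clean in pairs:
        n_tok, c_tok = noisy.split(), clean.split()
        if len(n_tok) != len(c_tok):          # keep the toy aligned 1:1 for simplicity
            continue
        for n in range(1, max_len + 1):
            for i in range(len(n_tok) - n + 1):
                src = " ".join(n_tok[i:i + n])
                tgt = " ".join(c_tok[i:i + n])
                if src != tgt:
                    table[src][tgt] += 1
    return {src: tgts.most_common(1)[0][0] for src, tgts in table.items()}

def correct(hypothesis, table, max_len=3):
    """Greedily replace the longest matching noisy phrase with its most frequent correction."""
    tokens, out, i = hypothesis.split(), [], 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            span = " ".join(tokens[i:i + n])
            if span in table:
                out.append(table[span]); i += n; break
        else:
            out.append(tokens[i]); i += 1
    return " ".join(out)

pairs = [("i red the book", "i read the book"), ("red the news", "read the news")]
table = learn_phrase_corrections(pairs)
print(correct("she red the paper", table))   # -> "she read the paper"
```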

Neural Predictive Coding using Convolutional Neural Networks towards Unsupervised Learning of Speaker Characteristics

no code implementations • 22 Feb 2018 • Arindam Jati, Panayiotis Georgiou

Two sets of experiments are done in different scenarios to evaluate the strength of NPC embeddings and compare with state-of-the-art in-domain supervised methods.

Speaker Identification • Speaker Recognition +3

Towards an Unsupervised Entrainment Distance in Conversational Speech using Deep Neural Networks

no code implementations • 23 Apr 2018 • Md Nasir, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics.

Transfer Learning from Adult to Children for Speech Recognition: Evaluation, Analysis and Recommendations

no code implementations • 8 May 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

Evaluations are presented on (i) comparisons of earlier GMM-HMM and the newer DNN Models, (ii) effectiveness of standard adaptation techniques versus transfer learning, (iii) various adaptation configurations in tackling the variabilities present in children speech, in terms of (a) acoustic spectral variability, and (b) pronunciation variability and linguistic constraints.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +2

Modeling Interpersonal Influence of Verbal Behavior in Couples Therapy Dyadic Interactions

no code implementations • 23 May 2018 • Sandeep Nallan Chakravarthula, Brian Baucom, Panayiotis Georgiou

Dyadic interactions among humans are marked by speakers continuously influencing and reacting to each other in terms of responses and behaviors, among others.

Unsupervised Online Multitask Learning of Behavioral Sentence Embeddings

no code implementations • 18 Jul 2018 • Shao-Yen Tseng, Brian Baucom, Panayiotis Georgiou

Unsupervised learning has been an attractive method for easily deriving meaningful data representations from vast amounts of unlabeled data.

Domain Adaptation • Sentence +1

Multi-label Multi-task Deep Learning for Behavioral Coding

no code implementations • 29 Oct 2018 • James Gibson, David C. Atkins, Torrey Creed, Zac Imel, Panayiotis Georgiou, Shrikanth Narayanan

We propose a methodology for estimating human behaviors in psychotherapy sessions using multi-label and multi-task learning paradigms.

Multi-Task Learning
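
A minimal PyTorch sketch of the multi-label, multi-task setup named in this entry: a shared encoder with one multi-label (sigmoid) head per coding task. The layer sizes, task names, and label counts are hypothetical, not the paper's architecture.

```python
# Shared encoder + per-task multi-label heads, trained with a summed BCE loss.
import torch
import torch.nn as nn

class MultiTaskBehaviorCoder(nn.Module):
    def __init__(self, input_dim=128, hidden_dim=64, tasks=None):
        super().__init__()
        tasks = tasks or {"therapist_codes": 5, "client_codes": 3}  # hypothetical label counts
        self.encoder = nn.Sequential(nn.Linear(input_dim, hidden_dim), nn.ReLU())
        self.heads = nn.ModuleDict({name: nn.Linear(hidden_dim, n) for name, n in tasks.items()})

    def forward(self, x):
        h = self.encoder(x)
        return {name: head(h) for name, head in self.heads.items()}  # raw logits per task

model = MultiTaskBehaviorCoder()
x = torch.randn(4, 128)                         # batch of 4 session-level feature vectors
targets = {"therapist_codes": torch.randint(0, 2, (4, 5)).float(),
           "client_codes": torch.randint(0, 2, (4, 3)).float()}
criterion = nn.BCEWithLogitsLoss()              # multi-label: independent sigmoid per code
loss = sum(criterion(out, targets[name]) for name, out in model(x).items())
loss.backward()
```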

Confusion2Vec: Towards Enriching Vector Space Word Representations with Representational Ambiguities

no code implementations • 8 Nov 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou

In this paper, we propose a novel word vector representation, Confusion2Vec, motivated from the human speech production and perception that encodes representational ambiguity.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +3
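
A rough intuition sketch, not the paper's algorithm: Confusion2Vec builds on the fact that acoustically confusable words co-occur in ASR output. Here a standard gensim Word2Vec is trained on hypothetical ASR alternative lists so that confusable words end up with nearby vectors; the data and hyperparameters are illustrative.

```python
# Treat each list of co-occurring ASR alternatives/hypothesis words as a word2vec "sentence",
# so acoustic confusions (see/sea, red/read) share contexts and receive similar embeddings.
from gensim.models import Word2Vec

asr_contexts = [                       # hypothetical n-best style contexts
    ["see", "the", "sea"],
    ["see", "the", "see"],
    ["read", "red", "the", "book"],
    ["eight", "ate", "the", "cake"],
]
model = Word2Vec(sentences=asr_contexts, vector_size=16, window=3, min_count=1, epochs=200)
print(model.wv.most_similar("see", topn=2))   # the confusable "sea" should rank near the top
```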

Speaker Diarization With Lexical Information

no code implementations • 27 Nov 2018 • Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou

This work presents a novel approach to leverage lexical information for speaker diarization.

Clustering • speaker-diarization +1

Spoken Language Intent Detection using Confusion2Vec

1 code implementation • 7 Apr 2019 • Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou

In this paper, we address the spoken language intent detection under noisy conditions imposed by automatic speech recognition (ASR) systems.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +3

Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance

no code implementations • 12 Apr 2019 • Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan

We find that our proposed measure is correlated with the therapist's empathy towards their patient in Motivational Interviewing and with affective behaviors in Couples Therapy.
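
A minimal sketch of the general idea named in the title: quantify lexical coordination between two speakers by the Word Mover's Distance between their turns. A tiny Word2Vec model stands in for the embeddings, gensim's `wmdistance` needs the POT package (or pyemd for older gensim versions), and the turns below are invented; this is not the paper's pipeline or data.

```python
# Lower WMD between a therapist turn and a patient turn = more lexical coordination.
from gensim.models import Word2Vec

corpus = [
    ["i", "feel", "really", "anxious", "about", "work"],
    ["work", "makes", "you", "feel", "anxious"],
    ["the", "weather", "is", "nice", "today"],
]
wv = Word2Vec(sentences=corpus, vector_size=16, min_count=1, epochs=200).wv

therapist = ["you", "feel", "anxious", "about", "work"]
patient   = ["i", "feel", "really", "anxious", "about", "work"]
unrelated = ["the", "weather", "is", "nice", "today"]

print(wv.wmdistance(therapist, patient))    # heavily overlapping turns -> small distance
print(wv.wmdistance(therapist, unrelated))  # unrelated turn -> larger distance
```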

Predicting Behavior in Cancer-Afflicted Patient and Spouse Interactions using Speech and Language

no code implementations • 2 Aug 2019 • Sandeep Nallan Chakravarthula, Haoqi Li, Shao-Yen Tseng, Maija Reblin, Panayiotis Georgiou

Cancer impacts the quality of life of those diagnosed as well as their spouse caregivers, in addition to potentially influencing their day-to-day behaviors.

Behavior Gated Language Models

no code implementations • 31 Aug 2019 • Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

In this work we derive motivation from psycholinguistics and propose the addition of behavioral information into the context of language modeling.

Language Modelling
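
A hedged sketch of one simple way to "gate" a language model with behavioral information: a sigmoid gate computed from a behavior vector modulates the recurrent hidden states before the softmax. The paper's exact gating formulation may differ; all dimensions and names here are illustrative.

```python
# Behavior-gated LSTM language model: behavior embedding -> per-unit sigmoid gate on hidden states.
import torch
import torch.nn as nn

class BehaviorGatedLM(nn.Module):
    def __init__(self, vocab_size=1000, emb_dim=64, hid_dim=128, behavior_dim=8):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.rnn = nn.LSTM(emb_dim, hid_dim, batch_first=True)
        self.gate = nn.Linear(behavior_dim, hid_dim)          # behavior -> gate logits
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, tokens, behavior):
        h, _ = self.rnn(self.embed(tokens))                   # (batch, seq, hid)
        g = torch.sigmoid(self.gate(behavior)).unsqueeze(1)   # (batch, 1, hid)
        return self.out(h * g)                                # gated states -> next-word logits

lm = BehaviorGatedLM()
tokens = torch.randint(0, 1000, (2, 10))
behavior = torch.randn(2, 8)                                  # e.g., behavior scores for the speaker
logits = lm(tokens, behavior)                                 # (2, 10, 1000)
```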

Multimodal Embeddings from Language Models

1 code implementation • 10 Sep 2019 • Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many natural language tasks.

Emotion Recognition • Language Modelling +1

Linking emotions to behaviors through deep transfer learning

1 code implementation • 8 Oct 2019 • Haoqi Li, Brian Baucom, Panayiotis Georgiou

Further, we investigate the importance of emotional-context in the expression of behavior by constraining (or not) the neural networks' contextual view of the data.

Emotion Recognition • Transfer Learning

RNN based Incremental Online Spoken Language Understanding

no code implementations • 23 Oct 2019 • Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

We introduce and analyze different recurrent neural network architectures for incremental and online processing of ASR transcripts and compare them to existing offline systems.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +8
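
A minimal sketch of incremental (token-by-token) spoken language understanding with a GRU: the hidden state is carried across tokens so an intent hypothesis is available after every new ASR word, rather than waiting for the full transcript. Architecture details, vocabulary, and token ids are illustrative, not the paper's models.

```python
# Streaming intent classification: call step() once per incoming ASR token.
import torch
import torch.nn as nn

class IncrementalIntentRNN(nn.Module):
    def __init__(self, vocab_size=500, emb_dim=32, hid_dim=64, n_intents=5):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.gru = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.classifier = nn.Linear(hid_dim, n_intents)

    def step(self, token_id, hidden=None):
        """Consume one token and return (intent logits so far, updated hidden state)."""
        emb = self.embed(token_id).unsqueeze(0).unsqueeze(0)   # (1, 1, emb_dim)
        out, hidden = self.gru(emb, hidden)
        return self.classifier(out[:, -1]), hidden

model = IncrementalIntentRNN()
hidden = None
for token_id in [12, 45, 301, 7]:                              # streaming ASR word ids
    logits, hidden = model.step(torch.tensor(token_id), hidden)
    print(logits.argmax(dim=-1).item())                        # running intent hypothesis
```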

Speaker-invariant Affective Representation Learning via Adversarial Training

no code implementations • 4 Nov 2019 • Haoqi Li, Ming Tu, Jing Huang, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we propose a machine learning framework to obtain speech emotion representations by limiting the effect of speaker variability in the speech signals.

Emotion Classification • Representation Learning +1
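
A minimal sketch of the standard gradient-reversal trick commonly used for adversarial speaker invariance: the emotion head is trained normally, while gradients from a speaker classifier are reversed before reaching the encoder, pushing it toward features uninformative about speaker identity. Feature dimensions, class counts, and data are illustrative, and this is the generic technique rather than the paper's exact framework.

```python
# Adversarial speaker-invariant training with a gradient reversal layer (GRL).
import torch
import torch.nn as nn

class GradReverse(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x, lam):
        ctx.lam = lam
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        return -ctx.lam * grad_output, None   # reverse gradients flowing into the encoder

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU())   # 40-dim acoustic features (illustrative)
emotion_head = nn.Linear(64, 4)                         # e.g., 4 emotion classes
speaker_head = nn.Linear(64, 10)                        # e.g., 10 training speakers

x = torch.randn(8, 40)
emotion_y = torch.randint(0, 4, (8,))
speaker_y = torch.randint(0, 10, (8,))

z = encoder(x)
loss = nn.functional.cross_entropy(emotion_head(z), emotion_y) \
     + nn.functional.cross_entropy(speaker_head(GradReverse.apply(z, 1.0)), speaker_y)
loss.backward()   # encoder receives reversed speaker gradients -> speaker-invariant features
```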

An analysis of observation length requirements for machine understanding of human behaviors from spoken language

no code implementations • 21 Nov 2019 • Sandeep Nallan Chakravarthula, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we investigate this link and present an analysis framework that determines appropriate window lengths for the task of behavior estimation.

Speaker Diarization with Lexical Information

no code implementations • 13 Apr 2020 • Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bo-Wen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +4

Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

1 code implementation • 3 Feb 2021 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

Confusion2vec, motivated from human speech production and perception, is a word vector representation which encodes ambiguities present in human spoken language in addition to semantics and syntactic information.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +5

Automated Evaluation Of Psychotherapy Skills Using Speech And Language Technologies

no code implementations • 22 Feb 2021 • Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Karan Singla, Victor Ardulov, Raghuveer Peri, Derek D. Caperton, James Gibson, Michael J. Tanana, Panayiotis Georgiou, Jake Van Epps, Sarah P. Lord, Tad Hirsch, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services.

Unsupervised Speech Representation Learning for Behavior Modeling using Triplet Enhanced Contextualized Networks

no code implementations • 1 Apr 2021 • Haoqi Li, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we exploit the stationary properties of human behavior within an interaction and present a representation learning method to capture behavioral information from speech in an unsupervised way.

Representation Learning
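
A minimal sketch of the triplet idea suggested by the title: if behavior is assumed near-stationary within an interaction, two segments from the same session form an (anchor, positive) pair and a segment from another session serves as the negative. The encoder, feature dimensions, and margin are illustrative assumptions, not the paper's network.

```python
# Triplet-based unsupervised behavior representation learning (illustrative encoder).
import torch
import torch.nn as nn

encoder = nn.Sequential(nn.Linear(40, 64), nn.ReLU(), nn.Linear(64, 32))  # speech-feature encoder
triplet_loss = nn.TripletMarginLoss(margin=1.0)

anchor_feats   = torch.randn(16, 40)   # segments from session A
positive_feats = torch.randn(16, 40)   # other segments from the same session A
negative_feats = torch.randn(16, 40)   # segments from a different session B

loss = triplet_loss(encoder(anchor_feats), encoder(positive_feats), encoder(negative_feats))
loss.backward()   # pulls same-session embeddings together, pushes other sessions apart
```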

Analysis and Tuning of a Voice Assistant System for Dysfluent Speech

no code implementations • 18 Jun 2021 • Vikramjit Mitra, Zifang Huang, Colin Lea, Lauren Tooley, Sarah Wu, Darren Botten, Ashwini Palekar, Shrinath Thelapurath, Panayiotis Georgiou, Sachin Kajarekar, Jeffrey Bigham

Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice operated systems do not work.

Intent Recognition • speech-recognition +1

CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations

no code implementations • 8 Feb 2022 • Vin Sachidananda, Shao-Yen Tseng, Erik Marchi, Sachin Kajarekar, Panayiotis Georgiou

By aligning audio representations to pretrained language representations and utilizing contrastive information between acoustic inputs, CALM is able to bootstrap audio embeddings competitive with existing audio representation models in only a few hours of training time.

Emotion Recognition • Natural Language Understanding
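
A minimal sketch of a CLIP-style contrastive alignment between audio and text embeddings (symmetric InfoNCE over a batch of paired utterances), which is the generic form of the contrastive alignment mentioned in this entry. Projection sizes, the temperature, and the random features are illustrative, not CALM's architecture.

```python
# Symmetric contrastive (InfoNCE) loss aligning audio embeddings with text embeddings.
import torch
import torch.nn as nn
import torch.nn.functional as F

audio_proj = nn.Linear(80, 128)     # maps pooled acoustic features to the shared space
text_proj  = nn.Linear(768, 128)    # maps pretrained sentence embeddings to the shared space

audio_feats = torch.randn(32, 80)   # batch of paired (audio, transcript) utterances
text_feats  = torch.randn(32, 768)

a = F.normalize(audio_proj(audio_feats), dim=-1)
t = F.normalize(text_proj(text_feats), dim=-1)
logits = a @ t.T / 0.07                           # cosine similarities scaled by temperature
labels = torch.arange(a.size(0))                  # matching pairs sit on the diagonal
loss = (F.cross_entropy(logits, labels) + F.cross_entropy(logits.T, labels)) / 2
loss.backward()
```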

Multimodal Data and Resource Efficient Device-Directed Speech Detection with Large Foundation Models

no code implementations • 6 Dec 2023 • Dominik Wagner, Alexander Churchill, Siddharth Sigtia, Panayiotis Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi

We compare the proposed system to unimodal baselines and show that the multimodal approach achieves lower equal-error-rates (EERs), while using only a fraction of the training data.

Automatic Speech Recognition • Language Modelling +3
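
A minimal sketch of how the equal-error-rate (EER) reported in this abstract is computed from detection scores: the EER is the operating point where the false-positive rate equals the false-negative rate. The scores and labels below are synthetic; this only illustrates the metric, not the paper's system or data.

```python
# EER from detector scores via the ROC curve (point where FPR and FNR cross).
import numpy as np
from sklearn.metrics import roc_curve

labels = np.array([1, 1, 1, 0, 0, 0, 1, 0])          # 1 = device-directed, 0 = background speech
scores = np.array([0.9, 0.8, 0.4, 0.35, 0.1, 0.05, 0.7, 0.6])

fpr, tpr, _ = roc_curve(labels, scores)
fnr = 1 - tpr
eer_index = np.nanargmin(np.abs(fpr - fnr))          # closest crossing of FPR and FNR
eer = (fpr[eer_index] + fnr[eer_index]) / 2
print(f"EER = {eer:.3f}")
```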
