no code implementations • 8 Feb 2022 • Vin Sachidananda, Shao-Yen Tseng, Erik Marchi, Sachin Kajarekar, Panayiotis Georgiou
By aligning audio representations to pretrained language representations and utilizing contrastive information between acoustic inputs, CALM is able to bootstrap audio embeddings that are competitive with existing audio representation models in only a few hours of training time.
no code implementations • 18 Jun 2021 • Vikramjit Mitra, Zifang Huang, Colin Lea, Lauren Tooley, Sarah Wu, Darren Botten, Ashwini Palekar, Shrinath Thelapurath, Panayiotis Georgiou, Sachin Kajarekar, Jeffrey Bigham
Dysfluencies and variations in speech pronunciation can severely degrade speech recognition performance, and for many individuals with moderate-to-severe speech disorders, voice-operated systems do not work.
no code implementations • 1 Apr 2021 • Haoqi Li, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou
In this paper, we exploit the stationary properties of human behavior within an interaction and present a representation learning method to capture behavioral information from speech in an unsupervised way.
no code implementations • 22 Feb 2021 • Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Karan Singla, Victor Ardulov, Raghuveer Peri, Derek D. Caperton, James Gibson, Michael J. Tanana, Panayiotis Georgiou, Jake Van Epps, Sarah P. Lord, Tad Hirsch, Zac E. Imel, David C. Atkins, Shrikanth Narayanan
With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services.
1 code implementation • 3 Feb 2021 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan
Confusion2vec, motivated by human speech production and perception, is a word vector representation that encodes ambiguities present in human spoken language in addition to semantic and syntactic information.
Automatic Speech Recognition (ASR)
no code implementations • 13 Apr 2020 • Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bo-Wen Zhou, Panayiotis Georgiou, Shrikanth Narayanan
This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition.
Automatic Speech Recognition (ASR)
no code implementations • 21 Nov 2019 • Sandeep Nallan Chakravarthula, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou
In this paper, we investigate this link and present an analysis framework that determines appropriate window lengths for the task of behavior estimation.
no code implementations • 4 Nov 2019 • Haoqi Li, Ming Tu, Jing Huang, Shrikanth Narayanan, Panayiotis Georgiou
In this paper, we propose a machine learning framework to obtain speech emotion representations by limiting the effect of speaker variability in the speech signals.
no code implementations • 23 Oct 2019 • Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan
We introduce and analyze different recurrent neural network architectures for incremental and online processing of ASR transcripts and compare them to existing offline systems.
Automatic Speech Recognition (ASR)
1 code implementation • 8 Oct 2019 • Haoqi Li, Brian Baucom, Panayiotis Georgiou
Further, we investigate the importance of emotional context in the expression of behavior by constraining (or not) the neural networks' contextual view of the data.
1 code implementation • 10 Sep 2019 • Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan
Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many natural language tasks.
no code implementations • 31 Aug 2019 • Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan
In this work we derive motivation from psycholinguistics and propose the addition of behavioral information into the context of language modeling.
no code implementations • 2 Aug 2019 • Sandeep Nallan Chakravarthula, Haoqi Li, Shao-Yen Tseng, Maija Reblin, Panayiotis Georgiou
Cancer impacts the quality of life of those diagnosed as well as their spouse caregivers, in addition to potentially influencing their day-to-day behaviors.
no code implementations • 12 Apr 2019 • Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan
We find that our proposed measure is correlated with the therapist's empathy towards their patient in Motivational Interviewing and with affective behaviors in Couples Therapy.
1 code implementation • 7 Apr 2019 • Prashanth Gurunath Shivakumar, Mu Yang, Panayiotis Georgiou
In this paper, we address spoken language intent detection under noisy conditions imposed by automatic speech recognition (ASR) systems.
Automatic Speech Recognition (ASR)
no code implementations • 27 Nov 2018 • Tae Jin Park, Kyu Han, Ian Lane, Panayiotis Georgiou
This work presents a novel approach to leverage lexical information for speaker diarization.
no code implementations • 8 Nov 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou
In this paper, we propose a novel word vector representation, Confusion2Vec, motivated by human speech production and perception, that encodes representational ambiguity.
Automatic Speech Recognition (ASR)
no code implementations • 29 Oct 2018 • James Gibson, David C. Atkins, Torrey Creed, Zac Imel, Panayiotis Georgiou, Shrikanth Narayanan
We propose a methodology for estimating human behaviors in psychotherapy sessions using multi-label and multi-task learning paradigms.
no code implementations • 18 Jul 2018 • Shao-Yen Tseng, Brian Baucom, Panayiotis Georgiou
Unsupervised learning has been an attractive method for easily deriving meaningful data representations from vast amounts of unlabeled data.
no code implementations • 23 May 2018 • Sandeep Nallan Chakravarthula, Brian Baucom, Panayiotis Georgiou
Dyadic interactions among humans are marked by speakers continuously influencing and reacting to each other in terms of responses and behaviors, among others.
no code implementations • 8 May 2018 • Prashanth Gurunath Shivakumar, Panayiotis Georgiou
Evaluations are presented on (i) comparisons of earlier GMM-HMM and newer DNN models, (ii) the effectiveness of standard adaptation techniques versus transfer learning, and (iii) various adaptation configurations for tackling the variabilities present in children's speech, in terms of (a) acoustic spectral variability and (b) pronunciation variability and linguistic constraints.
Automatic Speech Recognition (ASR)
no code implementations • 23 Apr 2018 • Md Nasir, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou
Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics.
no code implementations • 22 Feb 2018 • Arindam Jati, Panayiotis Georgiou
Two sets of experiments are conducted in different scenarios to evaluate the strength of NPC embeddings and compare them with state-of-the-art in-domain supervised methods.
1 code implementation • 7 Feb 2018 • Prashanth Gurunath Shivakumar, Haoqi Li, Kevin Knight, Panayiotis Georgiou
In this work we model ASR as a phrase-based noisy transformation channel and propose an error correction system that can learn from the aggregate errors of all the independent modules constituting the ASR and attempt to invert those.
Automatic Speech Recognition (ASR)
no code implementations • 12 Jan 2017 • Haoqi Li, Brian Baucom, Panayiotis Georgiou
Behavioral annotation using signal processing and machine learning is highly dependent on training data and manual annotations of behavioral labels.
no code implementations • 14 Jun 2016 • Haoqi Li, Brian Baucom, Panayiotis Georgiou
We propose a Sparsely-Connected and Disjointly-Trained DNN (SD-DNN) framework to deal with limited data.