Search Results for author: Shrikanth Narayanan

Found 102 papers, 24 papers with code

Joint Estimation and Analysis of Risk Behavior Ratings in Movie Scripts

no code implementations EMNLP 2020 Victor Martinez, Krishna Somandepalli, Yalda Tehranian-Uhls, Shrikanth Narayanan

Exposure to violent, sexual, or substance-abuse content in media increases the willingness of children and adolescents to imitate similar behaviors.

Evaluating Atypical Gaze Patterns through Vision Models: The Case of Cortical Visual Impairment

no code implementations 15 Feb 2024 Kleanthis Avramidis, Melinda Y. Chang, Rahul Sharma, Mark S. Borchert, Shrikanth Narayanan

A wide range of neurological and cognitive disorders exhibit distinct behavioral markers aside from their clinical manifestations.

Clinical Knowledge

Can Text-to-image Model Assist Multi-modal Learning for Visual Recognition with Visual Modality Missing?

no code implementations 14 Feb 2024 Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan

Specifically, we propose a simple but effective multi-modal learning framework GTI-MM to enhance the data efficiency and model robustness against missing visual modality by imputing the missing data with generative transformers.

Explainable Severity ranking via pairwise n-hidden comparison: a case study of glaucoma

no code implementations 5 Dec 2023 Hong Nguyen, Cuong V. Nguyen, Shrikanth Narayanan, Benjamin Y. Xu, Michael Pazzani

Primary open-angle glaucoma (POAG) is a chronic and progressive optic nerve condition that results in an acquired loss of optic nerve fibers and potential blindness.

Audio-visual child-adult speaker classification in dyadic interactions

no code implementations 3 Oct 2023 Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan

Building on the foundation of an audio-only child-adult speaker classification pipeline, we propose incorporating visual cues through active speaker detection and visual processing models.


Scaling Representation Learning from Ubiquitous ECG with State-Space Models

1 code implementation 26 Sep 2023 Kleanthis Avramidis, Dominika Kunc, Bartosz Perz, Kranti Adsul, Tiantian Feng, Przemysław Kazienko, Stanisław Saganowski, Shrikanth Narayanan

We train this model in a self-supervised manner with 275,000 10s ECG recordings collected in the wild and evaluate it on a range of downstream tasks.

Representation Learning

MM-AU: Towards Multimodal Understanding of Advertisement Videos

no code implementations 27 Aug 2023 Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan

Advertisement videos (ads) play an integral part in the domain of Internet e-commerce as they amplify the reach of particular products to a broad audience or can serve as a medium to raise awareness about specific issues through concise narrative structures.

Robust Self Supervised Speech Embeddings for Child-Adult Classification in Interactions involving Children with Autism

no code implementations 31 Jul 2023 Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

We address the problem of detecting who spoke when in child-inclusive spoken interactions, i.e., automatic child-adult speaker classification.


Learning Behavioral Representations of Routines From Large-scale Unlabeled Wearable Time-series Data Streams using Hawkes Point Process

no code implementations 10 Jul 2023 Tiantian Feng, Brandon M Booth, Shrikanth Narayanan

In this work, we propose a novel wearable time-series mining framework, Hawkes point process On Time series clusters for ROutine Discovery (HOT-ROD), for uncovering behavioral routines from completely unlabeled wearable recordings.

Time Series

FedMultimodal: A Benchmark For Multimodal Federated Learning

no code implementations 15 Jun 2023 Tiantian Feng, Digbalay Bose, Tuo Zhang, Rajat Hebbar, Anil Ramakrishna, Rahul Gupta, Mi Zhang, Salman Avestimehr, Shrikanth Narayanan

In order to facilitate the research in multimodal FL, we introduce FedMultimodal, the first FL benchmark for multimodal learning covering five representative multimodal applications from ten commonly used datasets with a total of eight unique modalities.

Emotion Recognition Federated Learning

Understanding Spoken Language Development of Children with ASD Using Pre-trained Speech Embeddings

no code implementations 23 May 2023 Anfeng Xu, Rajat Hebbar, Rimita Lahiri, Tiantian Feng, Lindsay Butler, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan

This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development by classification between child and adult speech and between speech and nonverbal vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.

Signal Processing Grand Challenge 2023 -- e-Prevention: Sleep Behavior as an Indicator of Relapses in Psychotic Patients

no code implementations 17 Apr 2023 Kleanthis Avramidis, Kranti Adsul, Digbalay Bose, Shrikanth Narayanan

This paper presents the approach and results of USC SAIL's submission to the Signal Processing Grand Challenge 2023 - e-Prevention (Task 2), on detecting relapses in psychotic patients.

Task 2

Contextually-rich human affect perception using multimodal scene information

1 code implementation 13 Mar 2023 Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Shrikanth Narayanan

The process of human affect understanding involves the ability to infer person specific emotional states from various sources including images, speech, and language.

Audio-Visual Activity Guided Cross-Modal Identity Association for Active Speaker Detection

1 code implementation 1 Dec 2022 Rahul Sharma, Shrikanth Narayanan

Active speaker detection in videos addresses associating a source face, visible in the video frames, with the underlying speech in the audio modality.

Audio-Visual Active Speaker Detection

A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations

no code implementations 7 Nov 2022 Rimita Lahiri, Md Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan

Vocal entrainment is a social adaptation mechanism in human interaction, knowledge of which can offer useful insights into an individual's cognitive-behavioral characteristics.

Using Emotion Embeddings to Transfer Knowledge Between Emotions, Languages, and Annotation Formats

1 code implementation 31 Oct 2022 Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan

In this work, we study how we can build a single model that can transition between these different configurations by leveraging multilingual models and Demux, a transformer-based model whose input includes the emotions of interest, enabling us to dynamically change the emotions predicted by the model.

Emotion Recognition

Leveraging Label Correlations in a Multi-label Setting: A Case Study in Emotion

1 code implementation 28 Oct 2022 Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan

First, we develop two modeling approaches to the problem in order to capture word associations of the emotion words themselves, by either including the emotions in the input, or by leveraging Masked Language Modeling (MLM).

Emotion Recognition Language Modelling +1

Multimodal Estimation of Change Points of Physiological Arousal in Drivers

1 code implementation 28 Oct 2022 Kleanthis Avramidis, Tiantian Feng, Digbalay Bose, Shrikanth Narayanan

Detecting unsafe driving states, such as stress, drowsiness, and fatigue, is an important component of ensuring driving safety and an essential prerequisite for automatic intervention systems in vehicles.

Time Series Time Series Analysis

Leveraging Open Data and Task Augmentation to Automated Behavioral Coding of Psychotherapy Conversations in Low-Resource Scenarios

no code implementations 25 Oct 2022 Zhuohao Chen, Nikolaos Flemotomos, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

In psychotherapy interactions, the quality of a session is assessed by codifying the communicative behaviors of participants during the conversation through manual observation and annotation.

Language Modelling Meta-Learning

MovieCLIP: Visual Scene Recognition in Movies

1 code implementation 20 Oct 2022 Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Haoyang Zhang, Yin Cui, Kree Cole-McLaughlin, Huisheng Wang, Shrikanth Narayanan

Longform media such as movies have complex narrative structures, with events spanning a rich variety of ambient visual scenes.

Genre classification Scene Recognition

Unsupervised active speaker detection in media content using cross-modal information

1 code implementation 24 Sep 2022 Rahul Sharma, Shrikanth Narayanan

We leverage speaker identity information from speech and faces, and formulate active speaker detection as a speech-face assignment task such that the active speaker's face and the underlying speech identify the same person (character).

VAuLT: Augmenting the Vision-and-Language Transformer for Sentiment Classification on Social Media

1 code implementation 18 Aug 2022 Georgios Chochlakis, Tejas Srinivasan, Jesse Thomason, Shrikanth Narayanan

VAuLT is an extension of the popular Vision-and-Language Transformer (ViLT), and improves performance on vision-and-language (VL) tasks that involve more complex text inputs than image captions while having minimal impact on training and inference efficiency.

Descriptive Image Captioning +4

Local dynamic mode of Cognitive Behavioral Therapy

no code implementations 28 Apr 2022 Victor Ardulov, Torrey A. Creed, David C. Atkins, Shrikanth Narayanan

In order to increase mental health equity among the most vulnerable and marginalized communities, it is important to increase access to high-quality therapists.

Multimodal Clustering with Role Induced Constraints for Speaker Diarization

no code implementations 1 Apr 2022 Nikolaos Flemotomos, Shrikanth Narayanan

Speaker clustering is an essential step in conventional speaker diarization systems and is typically addressed as an audio-only speech processing task.

Clustering speaker-diarization +1

Using Active Speaker Faces for Diarization in TV shows

no code implementations 30 Mar 2022 Rahul Sharma, Shrikanth Narayanan

Speaker diarization is one of the critical components of computational media intelligence as it enables a character-level analysis of story portrayals and media content understanding.

Face Clustering Face Detection +2

Mel Frequency Spectral Domain Defenses against Adversarial Attacks on Speech Recognition Systems

no code implementations 29 Mar 2022 Nicholas Mehlman, Anirudh Sreeram, Raghuveer Peri, Shrikanth Narayanan

A variety of recent works have looked into defenses for deep neural networks against adversarial attacks particularly within the image processing domain.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Audio visual character profiles for detecting background characters in entertainment media

no code implementations 21 Mar 2022 Rahul Sharma, Shrikanth Narayanan

We curate a background character dataset which provides annotations for background characters for a set of TV shows, and use it to evaluate the performance of the background character detection framework.

Active Speaker Localization Face Verification

To train or not to train adversarially: A study of bias mitigation strategies for speaker recognition

1 code implementation 17 Mar 2022 Raghuveer Peri, Krishna Somandepalli, Shrikanth Narayanan

In this paper, we systematically evaluate the biases present in speaker recognition systems with respect to gender across a range of system operating points.

Face Recognition Fairness +2

Semi-FedSER: Semi-supervised Learning for Speech Emotion Recognition On Federated Learning using Multiview Pseudo-Labeling

1 code implementation 15 Mar 2022 Tiantian Feng, Shrikanth Narayanan

In this work, we propose a semi-supervised federated learning framework, Semi-FedSER, that utilizes both labeled and unlabeled data samples to address the challenge of limited labeled data samples in FL.

Federated Learning Speech Emotion Recognition

Understanding of Emotion Perception from Art

no code implementations 13 Oct 2021 Digbalay Bose, Krishna Somandepalli, Souvik Kundu, Rimita Lahiri, Jonathan Gratch, Shrikanth Narayanan

Computational modeling of the emotions evoked by art in humans is a challenging problem because of the subjective and nuanced nature of art and affective signals.

Cross Domain Emotion Recognition using Few Shot Knowledge Transfer

no code implementations 11 Oct 2021 Justin Olah, Sabyasachee Baruah, Digbalay Bose, Shrikanth Narayanan

Emotion recognition from text is a challenging task due to diverse emotion taxonomies, lack of reliable labeled data in different domains, and highly subjective annotation standards.

Emotion Recognition Transfer Learning

Representation of professions in entertainment media: Insights into frequency and sentiment trends through computational text analysis

1 code implementation 8 Oct 2021 Sabyasachee Baruah, Krishna Somandepalli, Shrikanth Narayanan

We analyze the frequency and sentiment trends of different occupations, study the effect of media attributes like genre, country of production, and title type on these trends, and investigate whether the incidence of professions in media subtitles correlates with their real-world employment statistics.

Cultural Vocal Bursts Intensity Prediction Retrieval

Phone Duration Modeling for Speaker Age Estimation in Children

no code implementations 3 Sep 2021 Prashanth Gurunath Shivakumar, Somer Bishop, Catherine Lord, Shrikanth Narayanan

In this paper, we propose features specific to children and focus on speaker's phone duration as an important biomarker of children's age.

Age Estimation regression

An Automated Quality Evaluation Framework of Psychotherapy Conversations with Local Quality Estimates

no code implementations 15 Jun 2021 Zhuohao Chen, Nikolaos Flemotomos, Karan Singla, Torrey A. Creed, David C. Atkins, Shrikanth Narayanan

In particular, we model the global quality as a linear function of the local quality scores, which allows us to update the segment-level quality estimates based on the session-level quality prediction.

Acted vs. Improvised: Domain Adaptation for Elicitation Approaches in Audio-Visual Emotion Recognition

no code implementations 5 Apr 2021 Haoqi Li, Yelin Kim, Cheng-Hao Kuo, Shrikanth Narayanan

Key challenges in developing generalized automatic emotion recognition systems include scarcity of labeled data and lack of gold-standard references.

Domain Adaptation Emotion Recognition +1

Unsupervised Speech Representation Learning for Behavior Modeling using Triplet Enhanced Contextualized Networks

no code implementations 1 Apr 2021 Haoqi Li, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we exploit the stationary properties of human behavior within an interaction and present a representation learning method to capture behavioral information from speech in an unsupervised way.

Representation Learning

Front-end Diarization for Percussion Separation in Taniavartanam of Carnatic Music Concerts

no code implementations 4 Mar 2021 Nauman Dawalatabad, Jilt Sebastian, Jom Kuriakose, C. Chandra Sekhar, Shrikanth Narayanan, Hema A. Murthy

In this work, we address the problem of separating the percussive voices in the taniavartanam segments of Carnatic music.

Automated Quality Assessment of Cognitive Behavioral Therapy Sessions Through Highly Contextualized Language Representations

no code implementations 23 Feb 2021 Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Torrey A. Creed, David C. Atkins, Shrikanth Narayanan

In this work, we propose a BERT-based model for automatic behavioral scoring of a specific type of psychotherapy, called Cognitive Behavioral Therapy (CBT), where prior work is limited to frequency-based language features and/or short text excerpts which do not capture the unique elements involved in a spontaneous long conversational interaction.

Binary Classification

Automated Evaluation Of Psychotherapy Skills Using Speech And Language Technologies

no code implementations 22 Feb 2021 Nikolaos Flemotomos, Victor R. Martinez, Zhuohao Chen, Karan Singla, Victor Ardulov, Raghuveer Peri, Derek D. Caperton, James Gibson, Michael J. Tanana, Panayiotis Georgiou, Jake Van Epps, Sarah P. Lord, Tad Hirsch, Zac E. Imel, David C. Atkins, Shrikanth Narayanan

With the growing prevalence of psychological interventions, it is vital to have measures which rate the effectiveness of psychological care to assist in training, supervision, and quality assurance of services.

End-to-End Neural Systems for Automatic Children Speech Recognition: An Empirical Study

no code implementations 19 Feb 2021 Prashanth Gurunath Shivakumar, Shrikanth Narayanan

A key desideratum for inclusive and accessible speech recognition technology is ensuring its robust performance on children's speech.

speech-recognition Speech Recognition

Confusion2vec 2.0: Enriching Ambiguous Spoken Language Representations with Subwords

1 code implementation 3 Feb 2021 Prashanth Gurunath Shivakumar, Panayiotis Georgiou, Shrikanth Narayanan

Confusion2vec, motivated from human speech production and perception, is a word vector representation which encodes ambiguities present in human spoken language in addition to semantics and syntactic information.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +5

A Review of Speaker Diarization: Recent Advances with Deep Learning

no code implementations 24 Jan 2021 Tae Jin Park, Naoyuki Kanda, Dimitrios Dimitriadis, Kyu J. Han, Shinji Watanabe, Shrikanth Narayanan

Speaker diarization is a task to label audio or video recordings with classes that correspond to speaker identity, or in short, a task to identify "who spoke when".

Retrieval speaker-diarization +3

Robust Character Labeling in Movie Videos: Data Resources and Self-supervised Feature Adaptation

no code implementations 25 Aug 2020 Krishna Somandepalli, Rajat Hebbar, Shrikanth Narayanan

Our work in this paper focuses on two key aspects of this problem: the lack of domain-specific training or benchmark datasets, and adapting face embeddings learned on web images to long-form content, specifically movies.

Clustering Domain Adaptation +2

Victim or Perpetrator? Analysis of Violent Characters Portrayals from Movie Scripts

no code implementations 19 Aug 2020 Victor R. Martinez, Krishna Somandepalli, Karan Singla, Anil Ramanakrishna, Yalda T. Uhls, Shrikanth Narayanan

To date, we are the first to show that language used in movie scripts is a strong indicator of violent content, and that there are systematic portrayals of certain demographics as victims and perpetrators in a large dataset.

Adversarial Attack and Defense Strategies for Deep Speaker Recognition Systems

1 code implementation 18 Aug 2020 Arindam Jati, Chin-Cheng Hsu, Monisankha Pal, Raghuveer Peri, Wael Abd-Almageed, Shrikanth Narayanan

Robust speaker recognition, including in the presence of malicious attacks, is becoming increasingly important and essential, especially due to the proliferation of several smart speakers and personal agents that interact with an individual's voice commands to perform diverse, and even sensitive tasks.

Adversarial Attack Adversarial Robustness +1

Designing Neural Speaker Embeddings with Meta Learning

1 code implementation 31 Jul 2020 Manoj Kumar, Tae Jin-Park, Somer Bishop, Shrikanth Narayanan

Our experiments illustrate the applicability of meta-learning as a generalized learning paradigm for training deep neural speaker embeddings.

Audio and Speech Processing Sound

Evidence of Task-Independent Person-Specific Signatures in EEG using Subspace Techniques

no code implementations 27 Jul 2020 Mari Ganesh Kumar, Shrikanth Narayanan, Mriganka Sur, Hema A. Murthy

These high dimensional statistics are then projected to a lower dimensional space where the biometric information is preserved.

EEG Electroencephalogram (EEG) +2

Towards end-2-end learning for predicting behavior codes from spoken utterances in psychotherapy conversations

no code implementations ACL 2020 Karan Singla, Zhuohao Chen, David Atkins, Shrikanth Narayanan

Spoken language understanding tasks usually rely on pipelines involving complex processing blocks such as voice activity detection, speaker diarization, and automatic speech recognition (ASR).

Action Detection Activity Detection +6

Screenplay Quality Assessment: Can We Predict Who Gets Nominated?

no code implementations WS 2020 Ming-Chang Chiu, Tiantian Feng, Xiang Ren, Shrikanth Narayanan

Toward that goal, in this work, we present a method to evaluate the quality of a screenplay based on linguistic cues.

Generalized Multi-view Shared Subspace Learning using View Bootstrapping

no code implementations 12 May 2020 Krishna Somandepalli, Shrikanth Narayanan

A key objective in multi-view learning is to model the information common to multiple parallel views of a class of objects/events to improve downstream learning tasks.

3D Object Classification Face Recognition +2

Joint Multi-Dimensional Model for Global and Time-Series Annotations

no code implementations 6 May 2020 Anil Ramakrishna, Rahul Gupta, Shrikanth Narayanan

In this work we address this by proposing a generative model for multi-dimensional annotation fusion, which models the dimensions jointly leading to more accurate ground truth estimates.

Time Series Time Series Analysis

Speaker Diarization with Lexical Information

no code implementations 13 Apr 2020 Tae Jin Park, Kyu J. Han, Jing Huang, Xiaodong He, Bo-Wen Zhou, Panayiotis Georgiou, Shrikanth Narayanan

This work presents a novel approach for speaker diarization to leverage lexical information provided by automatic speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +4

TILES-2018, a longitudinal physiologic and behavioral data set of hospital workers

no code implementations 18 Mar 2020 Karel Mundnich, Brandon M. Booth, Michelle L'Hommedieu, Tiantian Feng, Benjamin Girault, Justin L'Hommedieu, Mackenzie Wildman, Sophia Skaaden, Amrutha Nadarajan, Jennifer L. Villatte, Tiago H. Falk, Kristina Lerman, Emilio Ferrara, Shrikanth Narayanan

We designed the study to investigate the use of off-the-shelf wearable and environmental sensors to understand individual-specific constructs such as job performance, interpersonal interaction, and well-being of hospital workers over time in their natural day-to-day job settings.

Privacy Preserving

A Label Proportions Estimation Technique for Adversarial Domain Adaptation in Text Classification

no code implementations 16 Mar 2020 Zhuohao Chen, Singla Karan, David C. Atkins, Zac E. Imel, Shrikanth Narayanan

The DAN-LPE simultaneously trains a domain adversarial net and processes label proportions estimation by the confusion of the source domain and the predictions of the target domain.

General Classification text-classification +2

Cross modal video representations for weakly supervised active speaker localization

no code implementations 9 Mar 2020 Rahul Sharma, Krishna Somandepalli, Shrikanth Narayanan

To avoid the need for manual annotations of active speakers in visual frames, which are very expensive to acquire, we present a weakly supervised system for the task of localizing active speakers in movie content.

Action Detection Active Speaker Localization +2

Auto-Tuning Spectral Clustering for Speaker Diarization Using Normalized Maximum Eigengap

1 code implementation 5 Mar 2020 Tae Jin Park, Kyu J. Han, Manoj Kumar, Shrikanth Narayanan

In this study, we propose a new spectral clustering framework that can auto-tune the parameters of the clustering algorithm in the context of speaker diarization.

 Ranked #1 on Speaker Diarization on CALLHOME (DER(ig olp) metric)

Clustering speaker-diarization +1

An analysis of observation length requirements for machine understanding of human behaviors from spoken language

no code implementations 21 Nov 2019 Sandeep Nallan Chakravarthula, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we investigate this link and present an analysis framework that determines appropriate window lengths for the task of behavior estimation.

Learning Behavioral Representations from Wearable Sensors

no code implementations 16 Nov 2019 Nazgol Tavabi, Homa Hosseinmardi, Jennifer L. Villatte, Andrés Abeliuk, Shrikanth Narayanan, Emilio Ferrara, Kristina Lerman

Continuous collection of physiological data from wearable sensors enables temporal characterization of individual behaviors.

Characterizing dynamically varying acoustic scenes from egocentric audio recordings in workplace setting

no code implementations 10 Nov 2019 Arindam Jati, Amrutha Nadarajan, Karel Mundnich, Shrikanth Narayanan

In this paper, we address the task of characterizing acoustic scenes in a workplace setting from audio recordings collected with wearable microphones.

Acoustic Scene Classification General Classification +1

Speaker-invariant Affective Representation Learning via Adversarial Training

no code implementations 4 Nov 2019 Haoqi Li, Ming Tu, Jing Huang, Shrikanth Narayanan, Panayiotis Georgiou

In this paper, we propose a machine learning framework to obtain speech emotion representations by limiting the effect of speaker variability in the speech signals.

Emotion Classification Representation Learning +1

Robust speaker recognition using unsupervised adversarial invariance

1 code implementation 3 Nov 2019 Raghuveer Peri, Monisankha Pal, Arindam Jati, Krishna Somandepalli, Shrikanth Narayanan

In this paper, we address the problem of speaker recognition in challenging acoustic conditions using a novel method to extract robust speaker-discriminative speech representations.

speaker-diarization Speaker Diarization +2

Learning Domain Invariant Representations for Child-Adult Classification from Speech

no code implementations 25 Oct 2019 Rimita Lahiri, Manoj Kumar, Somer Bishop, Shrikanth Narayanan

Diagnostic procedures for ASD (autism spectrum disorder) involve semi-naturalistic interactions between the child and a clinician.

Binary Classification General Classification

RNN based Incremental Online Spoken Language Understanding

no code implementations 23 Oct 2019 Prashanth Gurunath Shivakumar, Naveen Kumar, Panayiotis Georgiou, Shrikanth Narayanan

We introduce and analyze different recurrent neural network architectures for incremental and online processing of the ASR transcripts and compare them to the existing offline systems.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +8

Multimodal Embeddings from Language Models

1 code implementation 10 Sep 2019 Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

Word embeddings such as ELMo have recently been shown to model word semantics with greater efficacy through contextualized learning on large-scale language corpora, resulting in significant improvement in state of the art across many natural language tasks.

Emotion Recognition Language Modelling +1

The Ambiguous World of Emotion Representation

no code implementations 1 Sep 2019 Vidhyasaharan Sethu, Emily Mower Provost, Julien Epps, Carlos Busso, Nicholas Cummins, Shrikanth Narayanan

A key reason for this is the lack of a common mathematical framework to describe all the relevant elements of emotion representations.

Face Recognition Speaker Verification +2

Behavior Gated Language Models

no code implementations 31 Aug 2019 Prashanth Gurunath Shivakumar, Shao-Yen Tseng, Panayiotis Georgiou, Shrikanth Narayanan

In this work we derive motivation from psycholinguistics and propose the addition of behavioral information into the context of language modeling.

Language Modelling

Modeling Interpersonal Linguistic Coordination in Conversations using Word Mover's Distance

no code implementations 12 Apr 2019 Md Nasir, Sandeep Nallan Chakravarthula, Brian Baucom, David C. Atkins, Panayiotis Georgiou, Shrikanth Narayanan

We find that our proposed measure is correlated with the therapist's empathy towards their patient in Motivational Interviewing and with affective behaviors in Couples Therapy.

Multimodal Representation Learning using Deep Multiset Canonical Correlation

1 code implementation 3 Apr 2019 Krishna Somandepalli, Naveen Kumar, Ruchir Travadi, Shrikanth Narayanan

We propose Deep Multiset Canonical Correlation Analysis (dMCCA) as an extension to representation learning using CCA when the underlying signal is observed across multiple (more than two) modalities.

Representation Learning

On evaluating CNN representations for low resource medical image classification

no code implementations 26 Mar 2019 Taruna Agrawal, Rahul Gupta, Shrikanth Narayanan

Convolutional Neural Networks (CNNs) have revolutionized performances in several machine learning tasks such as image classification, object tracking, and keyword spotting.

General Classification Image Classification +5

Multi-label Multi-task Deep Learning for Behavioral Coding

no code implementations 29 Oct 2018 James Gibson, David C. Atkins, Torrey Creed, Zac Imel, Panayiotis Georgiou, Shrikanth Narayanan

We propose a methodology for estimating human behaviors in psychotherapy sessions using multi-label and multi-task learning paradigms.

Multi-Task Learning

Tensor Embedding: A Supervised Framework for Human Behavioral Data Mining and Prediction

no code implementations 31 Aug 2018 Homa Hosseinmardi, Amir Ghasemian, Shrikanth Narayanan, Kristina Lerman, Emilio Ferrara

Today's densely instrumented world offers tremendous opportunities for continuous acquisition and analysis of multimodal sensor data providing temporal characterization of an individual's behaviors.

Towards an Unsupervised Entrainment Distance in Conversational Speech using Deep Neural Networks

no code implementations 23 Apr 2018 Md Nasir, Brian Baucom, Shrikanth Narayanan, Panayiotis Georgiou

Entrainment is a known adaptation mechanism that causes interaction participants to adapt or synchronize their acoustic characteristics.

Linguistic analysis of differences in portrayal of movie characters

no code implementations ACL 2017 Anil Ramakrishna, Victor R. Martínez, Nikolaos Malandrakis, Karan Singla, Shrikanth Narayanan

We examine differences in portrayal of characters in movies using psycholinguistic and graph theoretic measures computed directly from screenplays.


Inferring object rankings based on noisy pairwise comparisons from multiple annotators

no code implementations 13 Dec 2016 Rahul Gupta, Shrikanth Narayanan

In this work, we propose Expectation-Maximization (EM) based algorithms that rely on the judgments from multiple annotators and the object attributes for inferring the latent ground truth.


The Twins Corpus of Museum Visitor Questions

no code implementations LREC 2012 Priti Aggarwal, Ron Artstein, Jillian Gerten, Athanasios Katsamanis, Shrikanth Narayanan, Angela Nazarian, David Traum

In addition to speech recordings, the corpus contains the outputs of speech recognition performed at the time of utterance as well as the system interpretation of the utterances.

Dialogue Management Natural Language Understanding +3
