no code implementations • 24 Jan 2024 • Benjamin A. T. Graham, Lauren Brown, Georgios Chochlakis, Morteza Dehghani, Raquel Delerme, Brittany Friedman, Ellie Graeden, Preni Golazizian, Rajat Hebbar, Parsa Hejabi, Aditya Kommineni, Mayagüez Salinas, Michael Sierra-Arévalo, Jackson Trager, Nicholas Weller, Shrikanth Narayanan
Interactions between government officials and civilians affect public wellbeing and the state legitimacy necessary for the functioning of a democratic society.
no code implementations • 27 Aug 2023 • Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan
Advertisement videos (ads) play an integral role in Internet e-commerce: they amplify the reach of particular products to broad audiences and can serve as a medium to raise awareness about specific issues through concise narrative structures.
no code implementations • 31 Jul 2023 • Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan
We address the problem of detecting who spoke when in child-inclusive spoken interactions, i.e., automatic child-adult speaker classification.
no code implementations • 15 Jun 2023 • Tiantian Feng, Digbalay Bose, Tuo Zhang, Rajat Hebbar, Anil Ramakrishna, Rahul Gupta, Mi Zhang, Salman Avestimehr, Shrikanth Narayanan
In order to facilitate the research in multimodal FL, we introduce FedMultimodal, the first FL benchmark for multimodal learning covering five representative multimodal applications from ten commonly used datasets with a total of eight unique modalities.
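The benchmark's core setting is standard federated learning applied across clients with multimodal data. As a minimal sketch (assumed for illustration, not taken from FedMultimodal), the server-side aggregation step common to such benchmarks averages client model parameters weighted by each client's number of training samples:

```python
# Minimal FedAvg-style server aggregation sketch (illustrative only,
# not FedMultimodal's implementation). Each client contributes its
# locally trained parameter vector and its local dataset size.

def fedavg(client_params, client_sizes):
    """Weighted average of client parameter vectors."""
    total = sum(client_sizes)
    dim = len(client_params[0])
    return [
        sum(p[i] * n for p, n in zip(client_params, client_sizes)) / total
        for i in range(dim)
    ]

# Toy example: two clients with unequal amounts of data, so the
# second client's parameters dominate the aggregate.
avg = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

In practice the parameter vectors would be the flattened weights of a multimodal model, and the weighting keeps clients with more data from being drowned out by smaller ones.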
no code implementations • 23 May 2023 • Anfeng Xu, Rajat Hebbar, Rimita Lahiri, Tiantian Feng, Lindsay Butler, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan
This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development, classifying between child and adult speech and between speech and nonverbal vocalization in NLS with respective macro F1 scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.
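The reported metric, macro F1, averages the per-class F1 scores so that both classes (e.g., child and adult) count equally regardless of how imbalanced the data is. A small self-contained sketch of the computation (labels and predictions below are illustrative, not the paper's data):

```python
# Macro-averaged F1: compute F1 per class, then take the unweighted mean.
# Toy labels are assumptions for illustration only.

def macro_f1(y_true, y_pred, labels):
    f1s = []
    for c in labels:
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
    return sum(f1s) / len(f1s)

y_true = ["child", "child", "adult", "adult", "child"]
y_pred = ["child", "adult", "adult", "adult", "child"]
score = macro_f1(y_true, y_pred, ["child", "adult"])
```

Because the per-class scores are averaged without weighting, a classifier that ignores the rarer class is penalized, which is why macro F1 is a common choice for imbalanced child-adult speech data.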
1 code implementation • 13 Mar 2023 • Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Shrikanth Narayanan
The process of human affect understanding involves the ability to infer person-specific emotional states from various sources including images, speech, and language.
1 code implementation • 14 Feb 2023 • Rajat Hebbar, Digbalay Bose, Krishna Somandepalli, Veena Vijai, Shrikanth Narayanan
In this work, we present a dataset of audio events called Subtitle-Aligned Movie Sounds (SAM-S).
no code implementations • 18 Dec 2022 • Tiantian Feng, Rajat Hebbar, Nicholas Mehlman, Xuan Shi, Aditya Kommineni, Shrikanth Narayanan
Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other.
1 code implementation • 20 Oct 2022 • Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Haoyang Zhang, Yin Cui, Kree Cole-McLaughlin, Huisheng Wang, Shrikanth Narayanan
Long-form media such as movies have complex narrative structures, with events spanning a rich variety of ambient visual scenes.
1 code implementation • 26 Dec 2021 • Tiantian Feng, Hanieh Hashemi, Rajat Hebbar, Murali Annavaram, Shrikanth S. Narayanan
To assess the information leakage of SER systems trained using FL, we propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters, corresponding to the FedSGD and the FedAvg training algorithms, respectively.
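The two training algorithms named above differ in what each client exposes to the server, and hence in what an attribute-inference attacker can observe: under FedSGD a client shares per-round gradients, while under FedAvg it shares model parameters after several local update steps. A toy sketch of the two update types, using a single linear model with illustrative shapes (assumed, not the paper's code):

```python
# Illustrative sketch of the two client-side signals an attribute
# inference attack could consume in federated SER. The model is a
# single linear layer w; data, loss, and shapes are toy assumptions.

def local_gradient(w, x, y):
    """FedSGD-style client: share the gradient of a squared loss."""
    pred = sum(wi * xi for wi, xi in zip(w, x))
    err = pred - y
    return [2 * err * xi for xi in x]  # dL/dw for L = (pred - y)^2

def local_update(w, x, y, lr=0.1, steps=3):
    """FedAvg-style client: run local SGD steps, share the parameters."""
    w = list(w)
    for _ in range(steps):
        g = local_gradient(w, x, y)
        w = [wi - lr * gi for wi, gi in zip(w, g)]
    return w

w0 = [0.0, 0.0]
grad = local_gradient(w0, [1.0, 2.0], 1.0)  # what FedSGD exposes
params = local_update(w0, [1.0, 2.0], 1.0)  # what FedAvg exposes
```

Either signal is a function of the client's private data, which is why gradients and locally updated parameters can both leak sensitive attributes such as speaker demographics.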
no code implementations • 25 Aug 2020 • Krishna Somandepalli, Rajat Hebbar, Shrikanth Narayanan
Our work in this paper focuses on two key aspects of this problem: the lack of domain-specific training or benchmark datasets, and adapting face embeddings learned on web images to long-form content, specifically movies.