no code implementations • 14 Sep 2024 • Tiantian Feng, Anfeng Xu, Xuan Shi, Somer Bishop, Shrikanth Narayanan
In this study, we design an experiment to perform speech sampling in BOSCC interviews from an egocentric perspective using wearable sensors and explore pre-training on Ego4D speech samples to enhance child-adult speaker classification in dyadic interactions.
1 code implementation • 13 Sep 2024 • Anfeng Xu, Tiantian Feng, Helen Tager-Flusberg, Catherine Lord, Shrikanth Narayanan
We then train a Whisper Encoder-based model, achieving strong zero-shot performance on child-adult speaker diarization using real datasets.
no code implementations • 28 Aug 2024 • Tiantian Feng, Tuo Zhang, Salman Avestimehr, Shrikanth S. Narayanan
Multimodal Federated Learning frequently encounters the challenge of client modality heterogeneity, leading to undesirable performance for the secondary modality in multimodal learning.
1 code implementation • 14 Jun 2024 • Tuo Zhang, Tiantian Feng, Yibin Ni, Mengqin Cao, Ruying Liu, Katharine Butler, Yanjun Weng, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr
Large vision-language models (VLMs) have demonstrated remarkable abilities in understanding everyday content.
1 code implementation • 13 Jun 2024 • Tiantian Feng, Dimitrios Dimitriadis, Shrikanth Narayanan
In contrast, we aim to evaluate the quality of audio generation by examining the effectiveness of using the generated audio as training data.
1 code implementation • 12 Jun 2024 • Jihwan Lee, Aditya Kommineni, Tiantian Feng, Kleanthis Avramidis, Xuan Shi, Sudarsana Kadiri, Shrikanth Narayanan
Speech decoding from EEG signals is a challenging task, where brain activity is modeled to estimate salient characteristics of acoustic stimuli.
1 code implementation • 12 Jun 2024 • Anfeng Xu, Kevin Huang, Tiantian Feng, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan
Speech foundation models, trained on vast datasets, have opened unique opportunities in addressing challenging low-resource speech understanding, such as child speech.
no code implementations • 13 May 2024 • Chang Huang, Junqiao Zhao, Shatong Zhu, Hongtu Zhou, Chen Ye, Tiantian Feng, Changjun Jiang
Value function factorization methods are commonly used in cooperative multi-agent reinforcement learning, with QMIX receiving significant attention.
no code implementations • 27 Apr 2024 • Tiantian Feng, Xuan Shi, Rahul Gupta, Shrikanth S. Narayanan
Automatic Speech Understanding (ASU) aims at human-like speech interpretation, providing nuanced intent, emotion, sentiment, and content understanding from speech and language (text) content conveyed in speech.
no code implementations • 3 Mar 2024 • Tiantian Feng, Anil Ramakrishna, Jimit Majmudar, Charith Peris, Jixuan Wang, Clement Chung, Richard Zemel, Morteza Ziyadi, Rahul Gupta
Federated Learning (FL) is a popular algorithm to train machine learning models on user data constrained to edge devices (for example, mobile phones) due to privacy concerns.
no code implementations • 14 Feb 2024 • Tiantian Feng, Daniel Yang, Digbalay Bose, Shrikanth Narayanan
Specifically, we propose a simple but effective multi-modal learning framework GTI-MM to enhance the data efficiency and model robustness against missing visual modality by imputing the missing data with generative transformers.
no code implementations • 3 Oct 2023 • Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan
Building on the foundation of an audio-only child-adult speaker classification pipeline, we propose incorporating visual cues through active speaker detection and visual processing models.
1 code implementation • 29 Sep 2023 • Samiul Alam, Tuo Zhang, Tiantian Feng, Hui Shen, Zhichao Cao, Dong Zhao, JeongGil Ko, Kiran Somasundaram, Shrikanth S. Narayanan, Salman Avestimehr, Mi Zhang
However, most existing FL works do not use datasets collected from authentic IoT devices and thus do not capture unique modalities and inherent challenges of IoT data.
1 code implementation • 26 Sep 2023 • Kleanthis Avramidis, Dominika Kunc, Bartosz Perz, Kranti Adsul, Tiantian Feng, Przemysław Kazienko, Stanisław Saganowski, Shrikanth Narayanan
We train this model in a self-supervised manner with 275,000 10-second ECG recordings collected in the wild and evaluate it on a range of downstream tasks.
no code implementations • 27 Aug 2023 • Digbalay Bose, Rajat Hebbar, Tiantian Feng, Krishna Somandepalli, Anfeng Xu, Shrikanth Narayanan
Advertisement videos (ads) play an integral part in the domain of Internet e-commerce as they amplify the reach of particular products to a broad audience or can serve as a medium to raise awareness about specific issues through concise narrative structures.
no code implementations • 31 Jul 2023 • Rimita Lahiri, Tiantian Feng, Rajat Hebbar, Catherine Lord, So Hyun Kim, Shrikanth Narayanan
We address the problem of detecting who spoke when in child-inclusive spoken interactions, i.e., automatic child-adult speaker classification.
no code implementations • 10 Jul 2023 • Tiantian Feng, Brandon M Booth, Shrikanth Narayanan
In this work, we propose a novel wearable time-series mining framework, Hawkes point process On Time series clusters for ROutine Discovery (HOT-ROD), for uncovering behavioral routines from completely unlabeled wearable recordings.
no code implementations • 15 Jun 2023 • Tiantian Feng, Digbalay Bose, Tuo Zhang, Rajat Hebbar, Anil Ramakrishna, Rahul Gupta, Mi Zhang, Salman Avestimehr, Shrikanth Narayanan
In order to facilitate the research in multimodal FL, we introduce FedMultimodal, the first FL benchmark for multimodal learning covering five representative multimodal applications from ten commonly used datasets with a total of eight unique modalities.
1 code implementation • 3 Jun 2023 • Tuo Zhang, Tiantian Feng, Samiul Alam, Dimitrios Dimitriadis, Sunwoo Lee, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr
Through comprehensive ablation analysis across various data modalities, we discover that the downstream model generated by synthetic data plays a crucial role in controlling the direction of gradient diversity during FL training, which enhances convergence speed and contributes to the notable accuracy boost observed with GPT-FL.
no code implementations • 23 May 2023 • Anfeng Xu, Rajat Hebbar, Rimita Lahiri, Tiantian Feng, Lindsay Butler, Lue Shen, Helen Tager-Flusberg, Shrikanth Narayanan
This paper proposes applications of speech processing technologies in support of automated assessment of children's spoken language development by classification between child and adult speech and between speech and nonverbal vocalization in NLS, with respective F1 macro scores of 82.6% and 67.8%, underscoring the potential for accurate and scalable tools for ASD research and clinical use.
1 code implementation • 19 May 2023 • Junqiao Zhao, Fenglin Zhang, Yingfeng Cai, Gengxuan Tian, Wenjie Mu, Chen Ye, Tiantian Feng
Visual Place Recognition (VPR) aims to retrieve frames from a geotagged database that are located at the same place as the query frame.
no code implementations • 18 Dec 2022 • Tiantian Feng, Rajat Hebbar, Nicholas Mehlman, Xuan Shi, Aditya Kommineni, Shrikanth Narayanan
Speech-centric machine learning systems have revolutionized many leading domains ranging from transportation and healthcare to education and defense, profoundly changing how people live, work, and interact with each other.
1 code implementation • 28 Oct 2022 • Kleanthis Avramidis, Tiantian Feng, Digbalay Bose, Shrikanth Narayanan
Detecting unsafe driving states, such as stress, drowsiness, and fatigue, is an important component of ensuring driving safety and an essential prerequisite for automatic intervention systems in vehicles.
1 code implementation • 15 Mar 2022 • Tiantian Feng, Shrikanth Narayanan
In this work, we propose a semi-supervised federated learning framework, Semi-FedSER, that utilizes both labeled and unlabeled data samples to address the challenge of limited labeled data samples in FL.
no code implementations • 11 Feb 2022 • Yingfeng Cai, Junqiao Zhao, Jiafeng Cui, Fenglin Zhang, Chen Ye, Tiantian Feng
Visual Place Recognition (VPR) in areas with similar scenes such as urban or indoor scenarios is a major challenge.
1 code implementation • 26 Dec 2021 • Tiantian Feng, Hanieh Hashemi, Rajat Hebbar, Murali Annavaram, Shrikanth S. Narayanan
To assess the information leakage of SER systems trained using FL, we propose an attribute inference attack framework that infers sensitive attribute information of the clients from shared gradients or model parameters, corresponding to the FedSGD and the FedAvg training algorithms, respectively.
1 code implementation • 30 May 2021 • Jianfeng Li, Junqiao Zhao, Shuangfu Song, Tiantian Feng
Compared with independent training, joint training can make full use of the geometric relationship between geometric elements and provide dynamic and static information of the scene.
no code implementations • WS 2020 • Ming-Chang Chiu, Tiantian Feng, Xiang Ren, Shrikanth Narayanan
Toward that goal, in this work, we present a method to evaluate the quality of a screenplay based on linguistic cues.
no code implementations • 18 Mar 2020 • Karel Mundnich, Brandon M. Booth, Michelle L'Hommedieu, Tiantian Feng, Benjamin Girault, Justin L'Hommedieu, Mackenzie Wildman, Sophia Skaaden, Amrutha Nadarajan, Jennifer L. Villatte, Tiago H. Falk, Kristina Lerman, Emilio Ferrara, Shrikanth Narayanan
We designed the study to investigate the use of off-the-shelf wearable and environmental sensors to understand individual-specific constructs such as job performance, interpersonal interaction, and well-being of hospital workers over time in their natural day-to-day job settings.
1 code implementation • 4 Mar 2020 • Jianfeng Li, Junqiao Zhao, Tiantian Feng, Chen Ye, Lu Xiong
In this paper, we propose an unsupervised learning method for estimating the optical flow between video frames, especially to solve the occlusion problem.
no code implementations • 26 Sep 2018 • Yewei Huang, Junqiao Zhao, Xudong He, Shaoming Zhang, Tiantian Feng
In this paper, we propose a novel and practical solution for the real-time indoor localization of autonomous driving in parking lots.