Search Results for author: Jinhan Wang

Found 8 papers, 2 papers with code

Turn-taking and Backchannel Prediction with Acoustic and Large Language Model Fusion

no code implementations • 26 Jan 2024 • Jinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran

We propose an approach for continuous prediction of turn-taking and backchanneling locations in spoken dialogue by fusing a neural acoustic model with a large language model (LLM).

Language Modelling • Large Language Model
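
As a rough illustration of the fusion idea described above, the sketch below concatenates frame-level acoustic embeddings with time-aligned LLM hidden states and applies a per-frame classifier. All module names, dimensions, and the three-way label set here are illustrative assumptions, not the paper's actual architecture.

```python
# Hypothetical sketch of acoustic + LLM fusion for continuous
# turn-taking / backchannel prediction (not the paper's exact model).
import torch
import torch.nn as nn

class FusionTagger(nn.Module):
    def __init__(self, acoustic_dim=256, llm_dim=768, num_classes=3):
        super().__init__()
        # Project each stream to a shared size before concatenation.
        self.acoustic_proj = nn.Linear(acoustic_dim, 256)
        self.llm_proj = nn.Linear(llm_dim, 256)
        # Per-frame classifier over {continue, turn-take, backchannel}
        # (an assumed label set).
        self.classifier = nn.Linear(512, num_classes)

    def forward(self, acoustic, llm_states):
        # acoustic: (batch, frames, acoustic_dim) from an acoustic encoder;
        # llm_states: (batch, frames, llm_dim) LLM hidden states aligned
        # (e.g. repeated) to the acoustic frame rate.
        fused = torch.cat(
            [self.acoustic_proj(acoustic), self.llm_proj(llm_states)], dim=-1
        )
        return self.classifier(fused)  # (batch, frames, num_classes)

# Smoke test with random tensors.
model = FusionTagger()
logits = model(torch.randn(2, 100, 256), torch.randn(2, 100, 768))
print(logits.shape)  # torch.Size([2, 100, 3])
```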

Towards Better Domain Adaptation for Self-supervised Models: A Case Study of Child ASR

1 code implementation • 28 Apr 2023 • Ruchao Fan, Yunzheng Zhu, Jinhan Wang, Abeer Alwan

With the proposed methods (E-APC and DRAFT), the relative WER improvements are even larger (30% and 19% on the OGI and MyST data, respectively) compared to models that use no pretraining.

Automatic Speech Recognition • Automatic Speech Recognition (ASR) +4
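
The pretraining methods referenced in the snippet build on autoregressive predictive coding (APC), which trains an encoder to predict a feature frame several steps into the future. A minimal sketch of the vanilla APC objective follows; E-APC and DRAFT are the paper's enhancements and are not reproduced here, and all hyperparameters are assumptions.

```python
# Minimal sketch of the vanilla APC objective (predict a feature frame
# `shift` steps ahead); E-APC/DRAFT from the paper build on top of this.
import torch
import torch.nn as nn

class APC(nn.Module):
    def __init__(self, feat_dim=80, hidden=512, shift=3):
        super().__init__()
        self.shift = shift  # how many frames into the future to predict
        self.rnn = nn.LSTM(feat_dim, hidden, num_layers=3, batch_first=True)
        self.head = nn.Linear(hidden, feat_dim)

    def loss(self, feats):
        # feats: (batch, frames, feat_dim) log-mel or MFCC features.
        out, _ = self.rnn(feats[:, :-self.shift])
        pred = self.head(out)
        target = feats[:, self.shift:]
        return nn.functional.l1_loss(pred, target)

model = APC()
print(model.loss(torch.randn(4, 200, 80)))  # scalar L1 pretraining loss
```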

Unsupervised Instance Discriminative Learning for Depression Detection from Speech Signals

no code implementations • 27 Jun 2022 • Jinhan Wang, Vijay Ravi, Jonathan Flint, Abeer Alwan

To learn instance-spread-out embeddings, we explore methods for sampling instances for a training batch (distinct speaker-based and random sampling).

Data Augmentation • Depression Detection +1
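
To illustrate the two sampling strategies named in the snippet: random sampling draws utterances uniformly, while distinct-speaker sampling draws each utterance in a batch from a different speaker, so the instance-discriminative loss never contrasts two clips of the same voice. Function names and the data layout below are assumptions.

```python
# Hypothetical batch samplers for instance-discriminative training:
# random vs. distinct-speaker sampling (data layout is an assumption).
import random
from collections import defaultdict

def random_batch(utterances, batch_size):
    # utterances: list of (speaker_id, utterance_id) pairs.
    return random.sample(utterances, batch_size)

def distinct_speaker_batch(utterances, batch_size):
    # Group utterances by speaker, then draw one utterance from each of
    # `batch_size` different speakers, so no two instances in the batch
    # share a voice.
    by_speaker = defaultdict(list)
    for spk, utt in utterances:
        by_speaker[spk].append(utt)
    speakers = random.sample(list(by_speaker), batch_size)
    return [(spk, random.choice(by_speaker[spk])) for spk in speakers]

data = [(f"spk{i % 10}", f"utt{i}") for i in range(100)]
print(distinct_speaker_batch(data, 8))  # 8 utterances, 8 distinct speakers
```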

VADOI: Voice-Activity-Detection Overlapping Inference For End-to-end Long-form Speech Recognition

no code implementations • 22 Feb 2022 • Jinhan Wang, Xiaosu Tong, Jinxi Guo, Di He, Roland Maas

Results show that the proposed method achieves a 20% relative computation cost reduction on LibriSpeech and the Microsoft Speech Language Translation long-form corpus while maintaining WER performance compared to the best-performing overlapping inference algorithm.

Action Detection • Activity Detection +3
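
The core idea is to place segment boundaries where the voice-activity detector is most confident there is no speech, rather than decoding every fixed overlapping window, which is where the computation savings come from. Below is one hypothetical way to pick cut points from frame-level speech probabilities; the window length and selection rule are illustrative assumptions, not the paper's algorithm.

```python
# Hypothetical VAD-based cut-point selection for long-form decoding:
# within each maximum-length window, cut at the least-speech-like frame
# (window length and rule are illustrative, not the paper's settings).
import random

def vad_cut_points(speech_probs, max_frames=3000):
    # speech_probs: per-frame P(speech) from a VAD model, in [0, 1].
    cuts, start = [], 0
    while start + max_frames < len(speech_probs):
        window = speech_probs[start + 1 : start + max_frames]
        # Cut where the VAD is most confident there is no speech,
        # guaranteeing at least one frame of progress per segment.
        offset = min(range(len(window)), key=window.__getitem__)
        cuts.append(start + 1 + offset)
        start = cuts[-1]
    return cuts  # frame indices at which to split the recording

probs = [random.random() for _ in range(10000)]
print(vad_cut_points(probs))
```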

FrAUG: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals

no code implementations • 11 Feb 2022 • Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan

The improvements for the CONVERGE (Mandarin) dataset when using the x-vector embeddings with CNN as the backend and MFCCs as input features were 9.32% (validation) and 12.99% (test).

Data Augmentation • Depression Detection
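
FrAUG's augmentation varies the frame rate of feature extraction, i.e., the window and hop lengths of the short-time analysis, producing multiple feature views of each utterance. A rough sketch using librosa MFCCs follows; the specific frame periods and the 50% overlap are assumptions, not the paper's settings.

```python
# Rough sketch of frame-rate-based augmentation: re-extract MFCCs with
# several window/hop lengths (frame periods here are assumptions).
import numpy as np
import librosa

def frame_rate_augment(wav, sr=16000, frame_periods_ms=(20, 25, 30)):
    views = []
    for period in frame_periods_ms:
        win = int(sr * period / 1000)   # window length in samples
        hop = win // 2                  # 50% overlap (an assumption)
        mfcc = librosa.feature.mfcc(
            y=wav, sr=sr, n_mfcc=13, n_fft=win, win_length=win, hop_length=hop
        )
        views.append(mfcc)  # one feature view per frame rate
    return views

wav = np.random.randn(16000 * 3).astype(np.float32)  # 3 s of noise
for v in frame_rate_augment(wav):
    print(v.shape)  # 13 x (number of frames, which varies with hop)
```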
