Search Results for author: Yiteng Huang

Found 10 papers, 0 papers with code

MASV: Speaker Verification with Global and Local Context Mamba

no code implementations • 14 Dec 2024 • Yang Liu, Li Wan, Yiteng Huang, Ming Sun, Yangyang Shi, Florian Metze

Deep learning models like Convolutional Neural Networks and transformers have shown impressive capabilities in speech verification, gaining considerable attention in the research community.

Mamba, Speaker Verification

M-BEST-RQ: A Multi-Channel Speech Foundation Model for Smart Glasses

no code implementations • 17 Sep 2024 • Yufeng Yang, Desh Raj, Ju Lin, Niko Moritz, Junteng Jia, Gil Keren, Egor Lakomkin, Yiteng Huang, Jacob Donley, Jay Mahadeokar, Ozlem Kalinli

For the conversational ASR task in particular, using only 8 hours of labeled speech, our model outperforms a supervised ASR baseline that is trained on 2000 hours of labeled data, which demonstrates the effectiveness of our approach.

Action Detection, Activity Detection, +4

Effective Integration of KAN for Keyword Spotting

no code implementations • 13 Sep 2024 • Anfeng Xu, Biqiao Zhang, Shuyu Kong, Yiteng Huang, Zhaojun Yang, Sangeeta Srivastava, Ming Sun

Keyword spotting (KWS) is an important speech processing component for smart devices with voice assistance capability.

Keyword Spotting, Kolmogorov-Arnold Networks

Disentangled Training with Adversarial Examples For Robust Small-footprint Keyword Spotting

no code implementations • 23 Aug 2024 • Zhenyu Wang, Li Wan, Biqiao Zhang, Yiteng Huang, Shang-Wen Li, Ming Sun, Xin Lei, Zhaojun Yang

A keyword spotting (KWS) engine that is continuously running on device is exposed to various speech signals that are usually unseen before.

Small-Footprint Keyword Spotting

AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition

no code implementations • 18 Jan 2024 • Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide

Wearable devices like smart glasses are approaching the compute capability to seamlessly generate real-time closed captions for live conversations.

Automatic Speech Recognition, Automatic Speech Recognition (ASR), +1

FADI-AEC: Fast Score Based Diffusion Model Guided by Far-end Signal for Acoustic Echo Cancellation

no code implementations • 8 Jan 2024 • Yang Liu, Li Wan, Yun Li, Yiteng Huang, Ming Sun, James Luan, Yangyang Shi, Xin Lei

Despite the potential of diffusion models in speech enhancement, their deployment in Acoustic Echo Cancellation (AEC) has been restricted.

Acoustic Echo Cancellation, Speech Enhancement

Handling the Alignment for Wake Word Detection: A Comparison Between Alignment-Based, Alignment-Free and Hybrid Approaches

no code implementations • 17 Feb 2023 • Vinicius Ribeiro, Yiteng Huang, Yuan Shangguan, Zhaojun Yang, Li Wan, Ming Sun

The third, proposed by us, is a hybrid solution in which the model is trained with a small set of aligned data and then tuned with a sizeable unaligned dataset.
