Search Results for author: Hu Hu

Found 16 papers, 7 papers with code

Bayesian adaptive learning to latent variables via Variational Bayes and Maximum a Posteriori

no code implementations • 24 Jan 2024 • Hu Hu, Sabato Marco Siniscalchi, Chin-Hui Lee

In this work, we aim to establish a Bayesian adaptive learning framework by focusing on estimating latent variables in deep neural network (DNN) models.

Acoustic Scene Classification Scene Classification +1

Paper
Add Code

TeachCLIP: Multi-Grained Teaching for Efficient Text-to-Video Retrieval

no code implementations • 2 Aug 2023 • Kaibin Tian, Ruixiang Zhao, Hu Hu, Runquan Xie, Fengzong Lian, Zhanhui Kang, Xirong Li

For efficient T2VR, we propose TeachCLIP with multi-grained teaching to let a CLIP4Clip based student network learn from more advanced yet computationally heavy models such as X-CLIP, TS2-Net and X-Pool .

Retrieval text similarity +2

Paper
Add Code

A study on joint modeling and data augmentation of multi-modalities for audio-visual scene classification

no code implementations • 7 Mar 2022 • Qing Wang, Jun Du, Siyuan Zheng, Yunqing Li, Yajian Wang, Yuzhong Wu, Hu Hu, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

In this paper, we propose two techniques, namely joint modeling and data augmentation, to improve system performances for audio-visual scene classification (AVSC).

Data Augmentation Scene Classification

Paper
Add Code

A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer

1 code implementation • 16 Oct 2021 • Hu Hu, Sabato Marco Siniscalchi, Chao-Han Huck Yang, Chin-Hui Lee

We propose a variational Bayesian (VB) approach to learning distributions of latent variables in deep neural network (DNN) models for cross-domain knowledge transfer, to address acoustic mismatches between training and testing conditions.

Acoustic Scene Classification Scene Classification +1

Paper
Code

Neural Model Reprogramming with Similarity Based Mapping for Low-Resource Spoken Command Recognition

1 code implementation • 8 Oct 2021 • Hao Yen, Pin-Jui Ku, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Pin-Yu Chen, Yu Tsao

In this study, we propose a novel adversarial reprogramming (AR) approach for low-resource spoken command recognition (SCR), and build an AR-SCR system.

Spoken Command Recognition Transfer Learning

Paper
Code

A Lottery Ticket Hypothesis Framework for Low-Complexity Device-Robust Neural Acoustic Scene Classification

no code implementations • 3 Jul 2021 • Hao Yen, Chao-Han Huck Yang, Hu Hu, Sabato Marco Siniscalchi, Qing Wang, Yuyang Wang, Xianjun Xia, Yuanjun Zhao, Yuzhong Wu, Yannan Wang, Jun Du, Chin-Hui Lee

We propose a novel neural model compression strategy combining data augmentation, knowledge transfer, pruning, and quantization for device-robust acoustic scene classification (ASC).

Acoustic Scene Classification Data Augmentation +5

Paper
Add Code

REDAT: Accent-Invariant Representation for End-to-End ASR by Domain Adversarial Training with Relabeling

no code implementations • 14 Dec 2020 • Hu Hu, Xuesong Yang, Zeynab Raeesy, Jinxi Guo, Gokce Keskin, Harish Arsikere, Ariya Rastrow, Andreas Stolcke, Roland Maas

Accents mismatching is a critical problem for end-to-end ASR.

Clustering

Paper
Add Code

A Two-Stage Approach to Device-Robust Acoustic Scene Classification

1 code implementation • 3 Nov 2020 • Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee

To improve device robustness, a highly desirable key feature of a competitive data-driven acoustic scene classification (ASC) system, a novel two-stage system based on fully convolutional neural networks (CNNs) is proposed.

Ranked #1 on Acoustic Scene Classification on TAU Urban Acoustic Scenes 2019 (using extra training data)

Acoustic Scene Classification Classification +4

Paper
Code

Relational Teacher Student Learning with Neural Label Embedding for Device Adaptation in Acoustic Scene Classification

no code implementations • 31 Jul 2020 • Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Chin-Hui Lee

In this paper, we propose a domain adaptation framework to address the device mismatch issue in acoustic scene classification leveraging upon neural label embedding (NLE) and relational teacher student learning (RTSL).

Acoustic Scene Classification Classification +3

Paper
Add Code

An Acoustic Segment Model Based Segment Unit Selection Approach to Acoustic Scene Classification with Partial Utterances

no code implementations • 31 Jul 2020 • Hu Hu, Sabato Marco Siniscalchi, Yannan Wang, Xue Bai, Jun Du, Chin-Hui Lee

In contrast to building scene models with whole utterances, the ASM-removed sub-utterances, i. e., acoustic utterances without stop acoustic segments, are then used as inputs to the AlexNet-L back-end for final classification.

Acoustic Scene Classification Classification +5

Paper
Add Code

Exploring Deep Hybrid Tensor-to-Vector Network Architectures for Regression Based Speech Enhancement

2 code implementations • 25 Jul 2020 • Jun Qi, Hu Hu, Yannan Wang, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Finally, our experiments of multi-channel speech enhancement on a simulated noisy WSJ0 corpus demonstrate that our proposed hybrid CNN-TT architecture achieves better results than both DNN and CNN models in terms of better-enhanced speech qualities and smaller parameter sizes.

regression Speech Enhancement

Paper
Code

Device-Robust Acoustic Scene Classification Based on Two-Stage Categorization and Data Augmentation

1 code implementation • 16 Jul 2020 • Hu Hu, Chao-Han Huck Yang, Xianjun Xia, Xue Bai, Xin Tang, Yajian Wang, Shutong Niu, Li Chai, Juanjuan Li, Hongning Zhu, Feng Bao, Yuanjun Zhao, Sabato Marco Siniscalchi, Yannan Wang, Jun Du, Chin-Hui Lee

On Task 1b development data set, we achieve an accuracy of 96. 7\% with a model size smaller than 500KB.

Acoustic Scene Classification Data Augmentation +3

Paper
Code

Exploring Pre-training with Alignments for RNN Transducer based End-to-End Speech Recognition

no code implementations • 1 May 2020 • Hu Hu, Rui Zhao, Jinyu Li, Liang Lu, Yifan Gong

Recently, the recurrent neural network transducer (RNN-T) architecture has become an emerging trend in end-to-end automatic speech recognition research due to its advantages of being capable for online streaming speech recognition.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +2

Paper
Add Code

L-Vector: Neural Label Embedding for Domain Adaptation

no code implementations • 25 Apr 2020 • Zhong Meng, Hu Hu, Jinyu Li, Changliang Liu, Yan Huang, Yifan Gong, Chin-Hui Lee

We propose a novel neural label embedding (NLE) scheme for the domain adaptation of a deep neural network (DNN) acoustic model with unpaired data samples from source and target domains.

Domain Adaptation

Paper
Add Code

Tensor-to-Vector Regression for Multi-channel Speech Enhancement based on Tensor-Train Network

2 code implementations • 3 Feb 2020 • Jun Qi, Hu Hu, Yannan Wang, Chao-Han Huck Yang, Sabato Marco Siniscalchi, Chin-Hui Lee

Finally, in 8-channel conditions, a PESQ of 3. 12 is achieved using 20 million parameters for TTN, whereas a DNN with 68 million parameters can only attain a PESQ of 3. 06.

regression Speech Enhancement

Paper
Code

Improving RNN Transducer Modeling for End-to-End Speech Recognition

1 code implementation • 26 Sep 2019 • Jinyu Li, Rui Zhao, Hu Hu, Yifan Gong

In this paper, we improve the RNN-T training in two aspects.

Automatic Speech Recognition Automatic Speech Recognition (ASR) +1

Paper
Code

Cannot find the paper you are looking for? You can Submit a new open access paper.