Search Results for author: Yue Gu

Found 12 papers, 1 paper with code

Improving Label Assignments Learning by Dynamic Sample Dropout Combined with Layer-wise Optimization in Speech Separation

no code implementations • 20 Nov 2023 • Chenyang Gao, Yue Gu, Ivan Marsic

Despite its success, previous studies showed that PIT is plagued by excessive label assignment switching in adjacent epochs, which impedes the model from learning better label assignments.
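
For context, PIT (permutation invariant training) scores every possible pairing of estimated and reference sources and trains against the cheapest assignment. A minimal PyTorch sketch of that assignment step follows; the MSE criterion and tensor shapes are illustrative assumptions, and the paper's dynamic sample dropout and layer-wise optimization are not shown.

```python
# Minimal permutation-invariant training (PIT) loss sketch in PyTorch.
# Shapes and the MSE criterion are illustrative assumptions, not the
# paper's exact setup.
from itertools import permutations

import torch


def pit_loss(estimates: torch.Tensor, references: torch.Tensor) -> torch.Tensor:
    """estimates, references: (batch, num_sources, samples)."""
    _, num_sources, _ = estimates.shape
    losses = []
    for perm in permutations(range(num_sources)):
        # Pairwise MSE under this particular label assignment.
        permuted = references[:, list(perm), :]
        losses.append(((estimates - permuted) ** 2).mean(dim=(1, 2)))
    # (num_perms, batch) -> keep the best assignment per sample.
    losses = torch.stack(losses, dim=0)
    best, _ = losses.min(dim=0)
    return best.mean()


if __name__ == "__main__":
    est = torch.randn(4, 2, 16000)
    ref = torch.randn(4, 2, 16000)
    print(pit_loss(est, ref))
```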

Speech Separation

On-Device Constrained Self-Supervised Speech Representation Learning for Keyword Spotting via Knowledge Distillation

no code implementations • 6 Jul 2023 • Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu

Our approach used a teacher-student framework to transfer knowledge from a larger, more complex model to a smaller, light-weight model using dual-view cross-correlation distillation and the teacher's codebook as learning objectives.
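
Below is a rough sketch of a cross-correlation-style distillation objective between teacher and student embeddings. It is a generic stand-in, not the paper's exact dual-view formulation; the codebook objective is omitted and all dimensions are assumptions.

```python
# Hedged sketch of a cross-correlation-based distillation loss between
# teacher and student embeddings; not the paper's dual-view objective.
import torch


def cross_correlation_distill(student: torch.Tensor,
                              teacher: torch.Tensor,
                              off_diag_weight: float = 5e-3) -> torch.Tensor:
    """student, teacher: (batch, dim) embeddings of the same inputs."""
    # Standardize each feature dimension over the batch.
    s = (student - student.mean(0)) / (student.std(0) + 1e-6)
    t = (teacher - teacher.mean(0)) / (teacher.std(0) + 1e-6)
    batch = s.shape[0]
    corr = s.T @ t / batch                       # (dim, dim) cross-correlation
    on_diag = (torch.diagonal(corr) - 1).pow(2).sum()
    off_diag = (corr - torch.diag(torch.diagonal(corr))).pow(2).sum()
    return on_diag + off_diag_weight * off_diag


if __name__ == "__main__":
    torch.manual_seed(0)
    s = torch.randn(32, 128)
    t = torch.randn(32, 128)
    print(cross_correlation_distill(s, t))
```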

Keyword Spotting Knowledge Distillation +1

Self-supervised speech representation learning for keyword-spotting with light-weight transformers

no code implementations • 7 Mar 2023 • Chenyang Gao, Yue Gu, Francesco Caliva, Yuzong Liu

Self-supervised speech representation learning (S3RL) is revolutionizing the way we leverage the ever-growing availability of data.

Keyword Spotting Representation Learning

Progressive Learning for Stabilizing Label Selection in Speech Separation with Mapping-based Method

no code implementations • 20 Oct 2021 • Chenyang Gao, Yue Gu, Ivan Marsic

We investigate the use of the mapping-based method in the time domain and show that it can perform better on a large training set than the masking-based method.
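
For context, a masking-based head predicts multiplicative masks applied to the mixture representation, while a mapping-based head regresses the per-source representation directly. The sketch below contrasts the two output heads on time-domain encoder features; layer types and dimensions are illustrative assumptions, not the paper's architecture.

```python
# Minimal contrast between masking-based and mapping-based separation heads
# operating on time-domain encoder features; dimensions are illustrative.
import torch
import torch.nn as nn


class MaskingHead(nn.Module):
    """Predict per-source masks and multiply them with the mixture features."""

    def __init__(self, feat_dim: int, num_sources: int):
        super().__init__()
        self.proj = nn.Conv1d(feat_dim, feat_dim * num_sources, kernel_size=1)
        self.num_sources = num_sources

    def forward(self, mix_feats: torch.Tensor) -> torch.Tensor:
        # mix_feats: (batch, feat_dim, frames)
        masks = torch.sigmoid(self.proj(mix_feats))
        masks = masks.view(mix_feats.size(0), self.num_sources, -1, mix_feats.size(2))
        return masks * mix_feats.unsqueeze(1)    # (batch, sources, feat_dim, frames)


class MappingHead(nn.Module):
    """Regress per-source features directly, without an explicit mask."""

    def __init__(self, feat_dim: int, num_sources: int):
        super().__init__()
        self.proj = nn.Conv1d(feat_dim, feat_dim * num_sources, kernel_size=1)
        self.num_sources = num_sources

    def forward(self, mix_feats: torch.Tensor) -> torch.Tensor:
        out = self.proj(mix_feats)
        return out.view(mix_feats.size(0), self.num_sources, -1, mix_feats.size(2))
```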

Speech Recognition Speech Separation

Recurrent Distillation based Crowd Counting

no code implementations • 14 Jun 2020 • Yue Gu, Wenxi Liu

In addition, leveraging our density map generation method, we propose an iterative distillation algorithm that progressively enhances our model with identical network structures, without significantly sacrificing the dimension of the output density maps.
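
One way such an iterative scheme can look, hedged as a generic self-distillation loop rather than the paper's recipe: each round trains a fresh model of identical structure against the ground-truth density maps plus the previous round's predictions. The model constructor, data loader, and loss weights below are placeholders.

```python
# Hedged sketch of iterative self-distillation with identical network
# structures; make_model, loader, and hyperparameters are placeholders.
import copy

import torch
import torch.nn as nn


def recurrent_distillation(make_model, loader, rounds=3, alpha=0.5,
                           epochs_per_round=1, device="cpu"):
    teacher = None
    for _ in range(rounds):
        student = make_model().to(device)
        optimizer = torch.optim.Adam(student.parameters(), lr=1e-4)
        for _ in range(epochs_per_round):
            for images, density_gt in loader:
                images, density_gt = images.to(device), density_gt.to(device)
                pred = student(images)
                loss = nn.functional.mse_loss(pred, density_gt)
                if teacher is not None:
                    # Distill from the previous round's model as a soft target.
                    with torch.no_grad():
                        soft_target = teacher(images)
                    loss = loss + alpha * nn.functional.mse_loss(pred, soft_target)
                optimizer.zero_grad()
                loss.backward()
                optimizer.step()
        teacher = copy.deepcopy(student).eval()  # next round distills from this model
    return teacher
```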

Crowd Counting

RHR-Net: A Residual Hourglass Recurrent Neural Network for Speech Enhancement

2 code implementations • 15 Apr 2019 • Jalal Abdulbaqi, Yue Gu, Ivan Marsic

Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss.

Speech Enhancement

Hybrid Attention based Multimodal Network for Spoken Language Classification

no code implementations • COLING 2018 • Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic

The proposed hybrid attention architecture helps the system focus on learning informative representations for both modality-specific feature extraction and model fusion.
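
A much-simplified stand-in for attention-weighted fusion of modality-specific features (e.g. text and audio) is sketched below; the paper's hybrid attention architecture is more involved, and every dimension and layer choice here is an assumption.

```python
# Hedged sketch of attention-weighted fusion of modality-specific features;
# a simplified stand-in with illustrative dimensions, not the paper's model.
import torch
import torch.nn as nn


class AttentionFusion(nn.Module):
    def __init__(self, text_dim=256, audio_dim=128, fused_dim=128, num_classes=4):
        super().__init__()
        self.text_proj = nn.Linear(text_dim, fused_dim)
        self.audio_proj = nn.Linear(audio_dim, fused_dim)
        self.attn = nn.Linear(fused_dim, 1)        # scores each modality
        self.classifier = nn.Linear(fused_dim, num_classes)

    def forward(self, text_feat, audio_feat):
        # text_feat: (batch, text_dim), audio_feat: (batch, audio_dim)
        modalities = torch.stack(
            [torch.tanh(self.text_proj(text_feat)),
             torch.tanh(self.audio_proj(audio_feat))], dim=1)   # (batch, 2, fused)
        weights = torch.softmax(self.attn(modalities), dim=1)   # (batch, 2, 1)
        fused = (weights * modalities).sum(dim=1)               # (batch, fused)
        return self.classifier(fused)


if __name__ == "__main__":
    model = AttentionFusion()
    print(model(torch.randn(8, 256), torch.randn(8, 128)).shape)  # (8, 4)
```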

Classification Emotion Recognition +4

Multimodal Affective Analysis Using Hierarchical Attention Strategy with Word-Level Alignment

no code implementations • ACL 2018 • Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic

Multimodal affective computing, learning to recognize and interpret human affects and subjective information from multiple data sources, is still challenging because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at an abstract level, ignoring time-dependent interactions between modalities.

Deep Multimodal Learning for Emotion Recognition in Spoken Language

no code implementations • 22 Feb 2018 • Yue Gu, Shuhong Chen, Ivan Marsic

In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language.

Emotion Recognition Sentence

Generalizing GANs: A Turing Perspective

no code implementations • NeurIPS 2017 • Roderich Gross, Yue Gu, Wei Li, Melvin Gauci

In this paper we examine how these algorithms relate to the Turing test, and derive what - from a Turing perspective - can be considered their defining features.

Progress Estimation and Phase Detection for Sequential Processes

no code implementations • 28 Feb 2017 • Xinyu Li, Yanyi Zhang, Jianyu Zhang, Yueyang Chen, Shuhong Chen, Yue Gu, Moliang Zhou, Richard A. Farneth, Ivan Marsic, Randall S. Burd

For the Olympic swimming dataset, our system achieved an accuracy of 88%, an F1-score of 0.58, a completeness estimation error of 6.3%, and a remaining-time estimation error of 2.9 minutes.

Activity Recognition Multimodal Deep Learning
