no code implementations • 6 Jul 2023 • Gene-Ping Yang, Yue Gu, Qingming Tang, Dongsu Du, Yuzong Liu
Our approach uses a teacher-student framework to transfer knowledge from a larger, more complex model to a smaller, lightweight model, with dual-view cross-correlation distillation and the teacher's codebook as learning objectives.
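The cross-correlation distillation objective can be illustrated with a minimal NumPy sketch. This is a Barlow Twins-style cross-correlation loss between teacher and student embeddings, not the paper's exact formulation; the `0.005` off-diagonal weight and the batch/embedding shapes are illustrative assumptions.

```python
import numpy as np

def cross_correlation_loss(teacher_emb, student_emb, off_diag_weight=0.005, eps=1e-8):
    """Sketch of a cross-correlation distillation loss.

    Standardizes each embedding dimension over the batch, forms the
    D x D cross-correlation matrix between teacher and student views,
    and penalizes its deviation from the identity matrix.
    """
    # Standardize each feature dimension across the batch
    t = (teacher_emb - teacher_emb.mean(0)) / (teacher_emb.std(0) + eps)
    s = (student_emb - student_emb.mean(0)) / (student_emb.std(0) + eps)
    n = t.shape[0]
    c = t.T @ s / n  # (D, D) cross-correlation matrix

    on_diag = ((np.diagonal(c) - 1.0) ** 2).sum()       # push diagonal to 1
    off_diag = (c ** 2).sum() - (np.diagonal(c) ** 2).sum()  # push rest to 0
    return on_diag + off_diag_weight * off_diag

rng = np.random.default_rng(0)
emb = rng.standard_normal((32, 16))
# Identical teacher/student embeddings give a near-zero loss;
# uncorrelated embeddings give a much larger one.
low = cross_correlation_loss(emb, emb)
high = cross_correlation_loss(emb, rng.standard_normal((32, 16)))
```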
no code implementations • 7 Mar 2023 • Chenyang Gao, Yue Gu, Francesco Caliva, Yuzong Liu
Self-supervised speech representation learning (S3RL) is revolutionizing the way we leverage the ever-growing availability of data.
no code implementations • 20 Oct 2021 • Chenyang Gao, Yue Gu, Ivan Marsic
We investigate the use of the mapping-based method in the time domain and show that it can perform better on a large training set than the masking-based method.
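The mapping/masking distinction can be made concrete with a toy NumPy sketch. A masking-based model predicts a bounded per-sample gain applied to the noisy input, while a mapping-based model regresses the clean waveform directly; the signal, noise level, and oracle targets below are illustrative assumptions, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.linspace(0.0, 1.0, 8000, endpoint=False)
clean = np.sin(2 * np.pi * 440 * t)                 # toy clean signal
noisy = clean + 0.3 * rng.standard_normal(t.shape)  # additive noise

# Masking-based: a bounded per-sample gain on the noisy input.
# Even this oracle mask cannot flip a sample's sign, so it cannot
# recover the clean waveform exactly.
safe_noisy = np.where(np.abs(noisy) > 1e-8, noisy, 1e-8)
oracle_mask = np.clip(clean / safe_noisy, 0.0, 1.0)
masked = noisy * oracle_mask

# Mapping-based: regress the clean waveform directly; the output space
# is unconstrained (here, the ideal output of an oracle mapping model).
mapped = clean.copy()

def mse(a, b):
    return float(np.mean((a - b) ** 2))
```

The bounded mask leaves a residual error wherever the noisy and clean samples disagree in sign, while the unconstrained mapping output can match the clean target exactly, which is one intuition for why mapping can win given enough training data.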
no code implementations • 14 Jun 2020 • Yue Gu, Wenxi Liu
In addition, leveraging our density map generation method, we propose an iterative distillation algorithm that progressively enhances our model with identical network structures, without significantly sacrificing the dimension of the output density maps.
no code implementations • 20 Jun 2019 • Yue Gu, Zhihao Du, Hui Zhang, Xueliang Zhang
To improve robustness, a speech enhancement front-end is incorporated.
2 code implementations • 15 Apr 2019 • Jalal Abdulbaqi, Yue Gu, Ivan Marsic
Most current speech enhancement models use spectrogram features that require an expensive transformation and result in phase information loss.
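The phase-loss point can be demonstrated with a short NumPy sketch, using a single DFT frame rather than a full STFT for brevity: magnitude-only features discard the phase, and without it the waveform cannot be reconstructed.

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal(512)   # one frame of a toy waveform

spec = np.fft.rfft(x)          # complex spectrum: magnitude AND phase
mag = np.abs(spec)             # magnitude-only feature discards phase

# Reconstructing from magnitude alone (zero phase) does not recover x:
x_zero_phase = np.fft.irfft(mag, n=len(x))

# Reconstructing with the true phase recovers x exactly:
x_full = np.fft.irfft(mag * np.exp(1j * np.angle(spec)), n=len(x))
```

This is why waveform-domain (time-domain) enhancement models avoid both the transform cost and the phase problem.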
no code implementations • COLING 2018 • Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic
The proposed hybrid attention architecture helps the system focus on learning informative representations for both modality-specific feature extraction and model fusion.
no code implementations • ACL 2018 • Yue Gu, Kangning Yang, Shiyu Fu, Shuhong Chen, Xinyu Li, Ivan Marsic
Multimodal affective computing, learning to recognize and interpret human affects and subjective information from multiple data sources, is still challenging because: (i) it is hard to extract informative features to represent human affects from heterogeneous inputs; (ii) current fusion strategies only fuse different modalities at abstract level, ignoring time-dependent interactions between modalities.
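One way to capture time-dependent interactions between modalities, rather than fusing only at an abstract level, is per-timestep attention over time-aligned streams. The NumPy sketch below is a generic illustration of that idea, not the paper's architecture; the scoring vector `w` stands in for a learned layer, and all shapes are illustrative assumptions.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def timestep_attention_fusion(text_feats, audio_feats, w):
    """Fuse two time-aligned modality streams with per-timestep attention.

    text_feats, audio_feats: (T, D) aligned feature sequences
    w: (2*D,) scoring vector (stand-in for a learned scoring layer)
    Returns a (2*D,) fused vector weighted by attention over time.
    """
    joint = np.concatenate([text_feats, audio_feats], axis=1)  # (T, 2D)
    scores = joint @ w                                         # (T,)
    alpha = softmax(scores)                                    # attention weights over time
    return (alpha[:, None] * joint).sum(axis=0)                # (2D,)
```

Because the weights are computed per timestep on the concatenated features, the fusion can emphasize moments where one modality is more informative than the other.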
no code implementations • 22 Feb 2018 • Yue Gu, Shuhong Chen, Ivan Marsic
In this paper, we present a novel deep multimodal framework to predict human emotions based on sentence-level spoken language.
no code implementations • NeurIPS 2017 • Roderich Gross, Yue Gu, Wei Li, Melvin Gauci
In this paper, we examine how these algorithms relate to the Turing test and derive what, from a Turing perspective, can be considered their defining features.
no code implementations • 28 Feb 2017 • Xinyu Li, Yanyi Zhang, Jianyu Zhang, Yueyang Chen, Shuhong Chen, Yue Gu, Moliang Zhou, Richard A. Farneth, Ivan Marsic, Randall S. Burd
For the Olympic swimming dataset, our system achieved an accuracy of 88%, an F1-score of 0.58, a completeness estimation error of 6.3%, and a remaining-time estimation error of 2.9 minutes.