Search Results for author: Xuesong Yang

Found 10 papers, 3 papers with code

LibFewShot: A Comprehensive Library for Few-shot Learning

1 code implementation • 10 Sep 2021 • Wenbin Li, Chuanqi Dong, Pinzhuo Tian, Tiexin Qin, Xuesong Yang, Ziyi Wang, Jing Huo, Yinghuan Shi, Lei Wang, Yang Gao, Jiebo Luo

Furthermore, based on LibFewShot, we provide comprehensive evaluations on multiple benchmark datasets with multiple backbone architectures to evaluate common pitfalls and effects of different training tricks.

Data Augmentation, Few-Shot Image Classification, +1

Triplet is All You Need with Random Mappings for Unsupervised Visual Representation Learning

no code implementations • 22 Jul 2021 • Wenbin Li, Xuesong Yang, Meihao Kong, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

However, methods of this type, such as SimCLR and MoCo, rely heavily on a large number of negative pairs and thus require either large batches or memory banks.

Representation Learning, Self-Supervised Learning

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

7 code implementations • 14 May 2019 • Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson

On the other hand, CVAE training is simple but does not come with the distribution-matching property of a GAN.

Style Transfer, Voice Conversion

When CTC Training Meets Acoustic Landmarks

no code implementations • 5 Nov 2018 • Di He, Xuesong Yang, Boon Pang Lim, Yi Liang, Mark Hasegawa-Johnson, Deming Chen

In this paper, the convergence properties of CTC are improved by incorporating acoustic landmarks.

Speech Recognition

Improved ASR for Under-Resourced Languages Through Multi-Task Learning with Acoustic Landmarks

no code implementations • 15 May 2018 • Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, Deming Chen

Furui first demonstrated that the identity of both consonant and vowel can be perceived from the C-V transition; later, Stevens proposed that acoustic landmarks are the primary cues for speech perception, and that steady-state regions are secondary or supplemental.

Automatic Speech Recognition, Multi-Task Learning, +1

Deep Learning Based Speech Beamforming

no code implementations • 15 Feb 2018 • Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florencio, Mark Hasegawa-Johnson

On the other hand, deep-learning-based enhancement approaches are able to learn complicated speech distributions and perform efficient inference, but they are unable to deal with a variable number of input channels.

Speech Enhancement

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

1 code implementation • 3 Dec 2016 • Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng

Natural language understanding and dialogue policy learning are both essential in conversational systems that predict the next system actions in response to a current user utterance.

Natural Language Understanding

Landmark-based consonant voicing detection on multilingual corpora

no code implementations • 10 Nov 2016 • Xiang Kong, Xuesong Yang, Mark Hasegawa-Johnson, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel

Three consonant voicing classifiers were developed: (1) manually selected acoustic features anchored at a phonetic landmark, (2) MFCCs (either averaged across the segment or anchored at the landmark), and (3) acoustic features computed using a convolutional neural network (CNN).
