Search Results for author: Xuesong Yang

Found 12 papers, 4 papers with code

Landmark-based consonant voicing detection on multilingual corpora

no code implementations • 10 Nov 2016 • Xiang Kong, Xuesong Yang, Mark Hasegawa-Johnson, Jeung-Yoon Choi, Stefanie Shattuck-Hufnagel

Three consonant voicing classifiers were developed: (1) manually selected acoustic features anchored at a phonetic landmark, (2) MFCCs (either averaged across the segment or anchored at the landmark), and (3) acoustic features computed using a convolutional neural network (CNN).
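
As a minimal sketch of classifier (2)'s feature extraction, the snippet below computes MFCCs either averaged over a consonant segment or from a short window anchored at a landmark time. It assumes librosa is available; the segment boundaries, landmark time, and window length are placeholders, not values from the paper.

    import numpy as np
    import librosa

    def segment_averaged_mfcc(y, sr, start_s, end_s, n_mfcc=13):
        """MFCCs averaged across a consonant segment (variant a of classifier 2)."""
        seg = y[int(start_s * sr):int(end_s * sr)]
        mfcc = librosa.feature.mfcc(y=seg, sr=sr, n_mfcc=n_mfcc,
                                    n_fft=int(0.025 * sr), hop_length=int(0.010 * sr))
        return mfcc.mean(axis=1)                      # (n_mfcc,)

    def landmark_anchored_mfcc(y, sr, landmark_s, half_win_s=0.025, n_mfcc=13):
        """MFCCs from a short window centred on a phonetic landmark (variant b)."""
        lo = max(0, int((landmark_s - half_win_s) * sr))
        hi = int((landmark_s + half_win_s) * sr)
        mfcc = librosa.feature.mfcc(y=y[lo:hi], sr=sr, n_mfcc=n_mfcc,
                                    n_fft=int(0.025 * sr), hop_length=int(0.010 * sr))
        return mfcc.mean(axis=1)

    # y, sr = librosa.load("utterance.wav", sr=16000)
    # feats = segment_averaged_mfcc(y, sr, start_s=0.42, end_s=0.55)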

End-to-End Joint Learning of Natural Language Understanding and Dialogue Manager

1 code implementation • 3 Dec 2016 • Xuesong Yang, Yun-Nung Chen, Dilek Hakkani-Tur, Paul Crook, Xiujun Li, Jianfeng Gao, Li Deng

Natural language understanding and dialogue policy learning are both essential in conversational systems that predict the next system actions in response to a current user utterance.

Natural Language Understanding
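
A rough sketch of the general idea, a shared utterance encoder feeding both NLU heads (slots, intent) and a system-action head trained jointly, is given below. The single-GRU encoder, layer sizes, and mean pooling are illustrative assumptions, not the paper's exact architecture.

    import torch
    import torch.nn as nn

    class JointNLUDM(nn.Module):
        """Shared encoder with slot-tagging, intent, and system-action heads (illustrative)."""
        def __init__(self, vocab_size, n_slots, n_intents, n_actions, emb=128, hid=128):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, emb)
            self.encoder = nn.GRU(emb, hid, batch_first=True, bidirectional=True)
            self.slot_head = nn.Linear(2 * hid, n_slots)      # per-token slot tags
            self.intent_head = nn.Linear(2 * hid, n_intents)  # utterance-level intent
            self.action_head = nn.Linear(2 * hid, n_actions)  # next system action(s)

        def forward(self, tokens):                   # tokens: (batch, seq_len)
            h, _ = self.encoder(self.embed(tokens))  # (batch, seq_len, 2*hid)
            pooled = h.mean(dim=1)                   # simple utterance summary
            return self.slot_head(h), self.intent_head(pooled), self.action_head(pooled)

    # Joint training sums the slot, intent, and action losses so NLU and the
    # dialogue policy are optimised together rather than in a pipeline.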

Deep Learning Based Speech Beamforming

no code implementations • 15 Feb 2018 • Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Dinei Florencio, Mark Hasegawa-Johnson

On the other hand, deep learning based enhancement approaches are able to learn complicated speech distributions and perform efficient inference, but they are unable to deal with a variable number of input channels.

Speech Enhancement

Improved ASR for Under-Resourced Languages Through Multi-Task Learning with Acoustic Landmarks

no code implementations • 15 May 2018 • Di He, Boon Pang Lim, Xuesong Yang, Mark Hasegawa-Johnson, Deming Chen

Furui first demonstrated that the identities of both consonants and vowels can be perceived from the C-V transition; later, Stevens proposed that acoustic landmarks are the primary cues for speech perception and that steady-state regions are secondary or supplemental.

Automatic Speech Recognition (ASR) +2
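
One way to picture multi-task training with landmarks is a shared acoustic encoder with an ASR head and an auxiliary landmark-detection head whose losses are combined with a tunable weight. The weighted-sum objective, layer sizes, and label counts below are assumptions for illustration, not the paper's exact setup.

    import torch
    import torch.nn as nn

    class LandmarkMTL(nn.Module):
        """Shared encoder with an ASR head and an auxiliary landmark head (illustrative)."""
        def __init__(self, n_mels=40, hid=256, n_phones=48, n_landmarks=2):
            super().__init__()
            self.encoder = nn.LSTM(n_mels, hid, num_layers=2, batch_first=True)
            self.asr_head = nn.Linear(hid, n_phones)          # frame-level phone posteriors
            self.landmark_head = nn.Linear(hid, n_landmarks)  # landmark vs. non-landmark

        def forward(self, feats):                # feats: (batch, frames, n_mels)
            h, _ = self.encoder(feats)
            return self.asr_head(h), self.landmark_head(h)

    # total_loss = asr_loss + aux_weight * landmark_loss, with aux_weight tuned on dev data;
    # the landmark task acts as a regulariser when transcribed data is scarce.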

When CTC Training Meets Acoustic Landmarks

no code implementations • 5 Nov 2018 • Di He, Xuesong Yang, Boon Pang Lim, Yi Liang, Mark Hasegawa-Johnson, Deming Chen

In this paper, the convergence properties of CTC are improved by incorporating acoustic landmarks.

Automatic Speech Recognition (ASR)
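
For reference, the core CTC objective can be computed with PyTorch's built-in loss as below; an auxiliary landmark term, as described in the paper, would be added on top of it. All tensor shapes here are illustrative.

    import torch
    import torch.nn as nn

    T, B, C = 100, 4, 30                          # frames, batch size, output symbols (0 = blank)
    log_probs = torch.randn(T, B, C).log_softmax(dim=-1)
    targets = torch.randint(1, C, (B, 20))        # padded label sequences
    input_lengths = torch.full((B,), T, dtype=torch.long)
    target_lengths = torch.full((B,), 20, dtype=torch.long)

    ctc = nn.CTCLoss(blank=0, zero_infinity=True)
    loss = ctc(log_probs, targets, input_lengths, target_lengths)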

AUTOVC: Zero-Shot Voice Style Transfer with Only Autoencoder Loss

11 code implementations • 14 May 2019 • Kaizhi Qian, Yang Zhang, Shiyu Chang, Xuesong Yang, Mark Hasegawa-Johnson

On the other hand, CVAE training is simple but does not come with the distribution-matching property of a GAN.

Style Transfer Voice Conversion
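
The core idea, an autoencoder trained with only a reconstruction loss whose narrow content bottleneck plus a target speaker embedding enables zero-shot conversion, can be sketched as follows. The GRU layers, dimensions, and simple concatenation scheme are assumptions for illustration, not the published AUTOVC architecture.

    import torch
    import torch.nn as nn

    class BottleneckVC(nn.Module):
        """Speaker-conditioned autoencoder with a narrow content bottleneck (illustrative)."""
        def __init__(self, n_mels=80, spk_dim=256, bottleneck=32):
            super().__init__()
            self.enc = nn.GRU(n_mels + spk_dim, bottleneck, batch_first=True)
            self.dec = nn.GRU(bottleneck + spk_dim, 512, batch_first=True)
            self.out = nn.Linear(512, n_mels)

        def forward(self, mels, src_spk, tgt_spk):    # mels: (B, T, n_mels); spk: (B, spk_dim)
            t = mels.size(1)
            enc_in = torch.cat([mels, src_spk[:, None].expand(-1, t, -1)], dim=-1)
            content, _ = self.enc(enc_in)             # bottleneck squeezes out speaker info
            dec_in = torch.cat([content, tgt_spk[:, None].expand(-1, t, -1)], dim=-1)
            h, _ = self.dec(dec_in)
            return self.out(h)                        # trained with reconstruction loss only

    # Training reconstructs with src_spk == tgt_spk; at inference, swap in the target
    # speaker's embedding to convert the voice without any GAN or CVAE objective.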

Trip-ROMA: Self-Supervised Learning with Triplets and Random Mappings

1 code implementation • 22 Jul 2021 • Wenbin Li, Xuesong Yang, Meihao Kong, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

However, in small data regimes, we cannot obtain a sufficient number of negative pairs, nor can we effectively avoid over-fitting when negatives are not used at all.

Representation Learning Self-Supervised Learning +1
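
A minimal sketch of a triplet objective applied after a fixed random mapping of the embeddings is shown below; the projection size, margin, and cosine-distance formulation are illustrative choices, not necessarily the paper's settings.

    import torch
    import torch.nn.functional as F

    def triplet_random_mapping_loss(anchor, positive, negative, R, margin=0.3):
        """Triplet margin loss on randomly projected, L2-normalised embeddings (illustrative)."""
        a, p, n = (F.normalize(x @ R, dim=1) for x in (anchor, positive, negative))
        d_ap = 1.0 - (a * p).sum(dim=1)           # cosine distance anchor-positive
        d_an = 1.0 - (a * n).sum(dim=1)           # cosine distance anchor-negative
        return F.relu(d_ap - d_an + margin).mean()

    # The random mapping R is sampled once and kept fixed during training, e.g.
    # R = torch.randn(emb_dim, 128) / emb_dim ** 0.5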

LibFewShot: A Comprehensive Library for Few-shot Learning

1 code implementation • 10 Sep 2021 • Wenbin Li, Ziyi Wang, Xuesong Yang, Chuanqi Dong, Pinzhuo Tian, Tiexin Qin, Jing Huo, Yinghuan Shi, Lei Wang, Yang Gao, Jiebo Luo

Furthermore, based on LibFewShot, we provide comprehensive evaluations on multiple benchmarks with various backbone architectures to evaluate common pitfalls and effects of different training tricks.

Data Augmentation Few-Shot Image Classification +2
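
For context, few-shot benchmarks of this kind are evaluated on N-way K-shot episodes. The sampler below is a generic illustration of that protocol and is not LibFewShot's API.

    import random

    def sample_episode(labels_to_images, n_way=5, k_shot=1, q_queries=15):
        """Sample a generic N-way K-shot episode: a support set and a query set."""
        classes = random.sample(list(labels_to_images), n_way)
        support, query = [], []
        for cls in classes:
            imgs = random.sample(labels_to_images[cls], k_shot + q_queries)
            support += [(img, cls) for img in imgs[:k_shot]]
            query += [(img, cls) for img in imgs[k_shot:]]
        return support, query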

NDGGNET-A Node Independent Gate based Graph Neural Networks

no code implementations • 11 May 2022 • Ye Tang, Xuesong Yang, Xinrui Liu, Xiwei Zhao, Zhangang Lin, Changping Peng

Graph Neural Networks (GNNs) are architectures for graph-structured data; they have been adopted in a wide range of tasks and achieve strong results in link prediction, node classification, graph classification, and more.

Graph Classification Link Prediction +1
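
A rough sketch of one plausible form of a per-node (node-independent) gate, computed from a node's own features and used to blend its representation with aggregated neighbour messages, is given below. This is an assumption about the general mechanism, not the paper's exact layer.

    import torch
    import torch.nn as nn

    class GatedGNNLayer(nn.Module):
        """Per-node gate blending self features with mean-aggregated neighbour messages."""
        def __init__(self, dim):
            super().__init__()
            self.msg = nn.Linear(dim, dim)
            self.gate = nn.Linear(dim, 1)

        def forward(self, x, adj):                  # x: (N, dim), adj: (N, N) dense adjacency
            deg = adj.sum(dim=1, keepdim=True).clamp(min=1)
            neigh = self.msg(adj @ x / deg)         # mean over neighbours, then transform
            g = torch.sigmoid(self.gate(x))         # gate depends only on the node itself
            return g * x + (1 - g) * torch.relu(neigh)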

A Unified Framework for Contrastive Learning from a Perspective of Affinity Matrix

no code implementations • 26 Nov 2022 • Wenbin Li, Meihao Kong, Xuesong Yang, Lei Wang, Jing Huo, Yang Gao, Jiebo Luo

In this study, we present a new unified contrastive learning representation framework (named UniCLR) that is suitable for all of the above four kinds of methods, from the novel perspective of a basic affinity matrix.

Contrastive Learning Representation Learning
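
The affinity-matrix view can be illustrated with a standard InfoNCE-style objective, where the pairwise similarity matrix of two augmented views serves as the affinity matrix and positives sit on its diagonal. This is a generic formulation, not necessarily UniCLR's exact loss.

    import torch
    import torch.nn.functional as F

    def affinity_infonce(z1, z2, temperature=0.1):
        """Cross-entropy over the row-wise affinity matrix of two views (generic InfoNCE)."""
        z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
        affinity = z1 @ z2.t() / temperature                   # (B, B) cosine affinity matrix
        targets = torch.arange(z1.size(0), device=z1.device)   # positives on the diagonal
        return F.cross_entropy(affinity, targets)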
