Search Results for author: Junyi Ao

Found 11 papers, 5 papers with code

Multi-View Self-Attention Based Transformer for Speaker Recognition

no code implementations • 11 Oct 2021 • Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang

In this work, we propose a novel multi-view self-attention mechanism and present an empirical study of different Transformer variants with or without the proposed attention mechanism for speaker recognition.

Speaker Recognition

SpeechT5: Unified-Modal Encoder-Decoder Pre-Training for Spoken Language Processing

3 code implementations • ACL 2022 • Junyi Ao, Rui Wang, Long Zhou, Chengyi Wang, Shuo Ren, Yu Wu, Shujie Liu, Tom Ko, Qing Li, Yu Zhang, Zhihua Wei, Yao Qian, Jinyu Li, Furu Wei

Motivated by the success of T5 (Text-To-Text Transfer Transformer) in pre-trained natural language processing models, we propose a unified-modal SpeechT5 framework that explores encoder-decoder pre-training for self-supervised speech/text representation learning.
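To make the unified-modal idea concrete, here is a minimal, illustrative PyTorch sketch of a shared encoder-decoder fed by modality-specific pre-nets; the module names, sizes, and overall structure are assumptions for illustration, not the released SpeechT5 implementation.

```python
# Conceptual sketch (NOT the official SpeechT5 code): modality-specific
# pre-nets map speech or text into a shared space, and one shared
# encoder-decoder backbone serves both modalities.
import torch
import torch.nn as nn

class UnifiedEncoderDecoder(nn.Module):
    def __init__(self, d_model=256, vocab_size=1000, n_mels=80):
        super().__init__()
        self.text_prenet = nn.Embedding(vocab_size, d_model)   # text -> shared space
        self.speech_prenet = nn.Linear(n_mels, d_model)        # speech frames -> shared space
        self.backbone = nn.Transformer(
            d_model=d_model, nhead=4,
            num_encoder_layers=2, num_decoder_layers=2,
            batch_first=True,
        )
        self.text_postnet = nn.Linear(d_model, vocab_size)     # shared space -> text logits

    def forward(self, src, tgt_tokens, src_is_speech=True):
        # Pick the pre-net that matches the input modality.
        enc_in = self.speech_prenet(src) if src_is_speech else self.text_prenet(src)
        dec_in = self.text_prenet(tgt_tokens)
        hidden = self.backbone(enc_in, dec_in)
        return self.text_postnet(hidden)

# Speech-to-text style forward pass on dummy inputs.
model = UnifiedEncoderDecoder()
speech = torch.randn(2, 120, 80)          # (batch, frames, mel bins)
tokens = torch.randint(0, 1000, (2, 20))  # (batch, target length)
logits = model(speech, tokens, src_is_speech=True)
print(logits.shape)  # torch.Size([2, 20, 1000])
```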

Automatic Speech Recognition (ASR) +7

LightHuBERT: Lightweight and Configurable Speech Representation Learning with Once-for-All Hidden-Unit BERT

1 code implementation • 29 Mar 2022 • Rui Wang, Qibing Bai, Junyi Ao, Long Zhou, Zhixiang Xiong, Zhihua Wei, Yu Zhang, Tom Ko, Haizhou Li

LightHuBERT outperforms the original HuBERT on ASR and five SUPERB tasks with the HuBERT size, achieves comparable performance to the teacher model in most tasks with a reduction of 29% parameters, and obtains a $3.5\times$ compression ratio in three SUPERB tasks, e.g., automatic speaker verification, keyword spotting, and intent classification, with a slight accuracy loss.

Automatic Speech Recognition (ASR) +6

token2vec: A Joint Self-Supervised Pre-training Framework Using Unpaired Speech and Text

no code implementations • 30 Oct 2022 • Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

First, due to the distinct characteristics of the speech and text modalities, where speech is continuous while text is discrete, we discretize speech into a sequence of discrete speech tokens to solve the modality mismatch problem.
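As an illustration of the discretization step described above, the following sketch clusters frame-level speech features with k-means and replaces each frame by its cluster ID, yielding a discrete token sequence; the feature dimensionality, vocabulary size, and repeat-collapsing step are assumptions, not the paper's actual configuration.

```python
# Minimal sketch of speech discretization via k-means clustering.
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Stand-in for frame-level features (e.g. MFCCs or hidden states from a
# pre-trained encoder): 500 frames x 39 dims of dummy data.
frames = rng.normal(size=(500, 39))

# Learn a "speech token" vocabulary of 100 cluster centroids.
kmeans = KMeans(n_clusters=100, n_init=10, random_state=0).fit(frames)

# Discretize an utterance: each frame becomes a token ID.
token_ids = kmeans.predict(frames)

# Optionally collapse consecutive repeats, as is common for unit sequences.
deduped = [int(t) for i, t in enumerate(token_ids) if i == 0 or t != token_ids[i - 1]]
print(token_ids[:10], len(deduped))
```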

Intent Classification +1

Self-Supervised Acoustic Word Embedding Learning via Correspondence Transformer Encoder

no code implementations • 19 Jul 2023 • Jingru Lin, Xianghu Yue, Junyi Ao, Haizhou Li

We train the model based on the idea that different realisations of the same word should be close in the underlying embedding space.
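The following minimal sketch illustrates that training idea with a simple cosine-similarity contrastive loss that pulls paired realisations of the same word together and treats other words in the batch as negatives; the loss form, temperature, and embedding sizes are illustrative assumptions rather than the paper's exact correspondence objective or encoder.

```python
# Minimal sketch: embeddings of two realisations of the same word should
# be close; realisations of different words act as in-batch negatives.
import torch
import torch.nn.functional as F

def correspondence_loss(anchor, positive, temperature=0.1):
    """anchor/positive: (batch, dim) embeddings of paired realisations
    of the same word; other items in the batch serve as negatives."""
    a = F.normalize(anchor, dim=-1)
    p = F.normalize(positive, dim=-1)
    logits = a @ p.t() / temperature            # pairwise cosine similarities
    targets = torch.arange(a.size(0))           # matching pair lies on the diagonal
    return F.cross_entropy(logits, targets)

# Dummy embeddings from two realisations of the same 8 words.
emb_a = torch.randn(8, 128, requires_grad=True)
emb_b = torch.randn(8, 128)
loss = correspondence_loss(emb_a, emb_b)
loss.backward()
print(float(loss))
```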

Word Embeddings

The NUS-HLT System for ICASSP2024 ICMC-ASR Grand Challenge

no code implementations • 26 Dec 2023 • Meng Ge, Yizhou Peng, Yidi Jiang, Jingru Lin, Junyi Ao, Mehmet Sinan Yildirim, Shuai Wang, Haizhou Li, Mengling Feng

This paper summarizes our team's efforts in both tracks of the ICMC-ASR Challenge for in-car multi-channel automatic speech recognition.

Automatic Speech Recognition Data Augmentation +2

Text-guided HuBERT: Self-Supervised Speech Pre-training via Generative Adversarial Networks

no code implementations • 24 Feb 2024 • Duo Ma, Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li

In this paper, we investigate a new way to pre-train such a joint speech-text model to learn enhanced speech representations and benefit various speech-related downstream tasks.

Pseudo Label Self-Supervised Learning

The YiTrans Speech Translation System for IWSLT 2022 Offline Shared Task

no code implementations • IWSLT (ACL) 2022 • Ziqiang Zhang, Junyi Ao

This paper describes the submission of our end-to-end YiTrans speech translation system for the IWSLT 2022 offline task, which translates from English audio to German, Chinese, and Japanese.

Data Augmentation Translation
