Search Results for author: Yangze Li

Found 5 papers, 1 paper with code

Freeze-Omni: A Smart and Low Latency Speech-to-speech Dialogue Model with Frozen LLM

no code implementations • 1 Nov 2024 • Xiong Wang, Yangze Li, Chaoyou Fu, Yunhang Shen, Lei Xie, Ke Li, Xing Sun, Long Ma

Our main contribution is that the speech input and output modalities can be easily connected to a textual LLM while keeping the LLM's parameters frozen throughout the training process.
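The frozen-LLM setup described above can be sketched as follows. This is a purely illustrative toy, not the authors' implementation: parameter names such as `llm.*` and `speech_encoder.*` are hypothetical, and in a real framework like PyTorch the same effect would come from setting `requires_grad = False` on the LLM's parameters so only the speech modules are updated.

```python
# Toy sketch of training with a frozen LLM: only parameters whose
# requires_grad flag is True would receive gradient updates.

class Param:
    def __init__(self, value, requires_grad=True):
        self.value = value
        self.requires_grad = requires_grad

def trainable(params):
    # Return the names of parameters that the optimizer would update.
    return [name for name, p in params.items() if p.requires_grad]

params = {
    "llm.layer0.weight": Param(0.1, requires_grad=False),  # frozen LLM
    "llm.layer1.weight": Param(0.2, requires_grad=False),  # frozen LLM
    "speech_encoder.proj": Param(0.3),                     # trained speech input module
    "speech_decoder.proj": Param(0.4),                     # trained speech output module
}

print(trainable(params))  # ['speech_encoder.proj', 'speech_decoder.proj']
```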

MMGER: Multi-modal and Multi-granularity Generative Error Correction with LLM for Joint Accent and Speech Recognition

no code implementations • 6 May 2024 • Bingshen Mu, Yangze Li, Qijie Shao, Kun Wei, Xucheng Wan, Naijun Zheng, Huan Zhou, Lei Xie

Accents are deviations from standard pronunciation norms; multi-task learning for simultaneous ASR and accent recognition (AR) has proven effective in multi-accent scenarios, making it a prominent solution.
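Multi-task ASR + accent recognition is conventionally trained by minimizing a weighted sum of the two task losses. The sketch below shows that standard formulation; the weight `lam` is a hypothetical hyperparameter, not a value from the MMGER paper.

```python
# Common multi-task training objective (a sketch, not MMGER's exact recipe):
# total loss = ASR loss + lam * accent-recognition (AR) loss.

def multitask_loss(asr_loss, ar_loss, lam=0.3):
    # lam trades off transcription accuracy against accent classification.
    return asr_loss + lam * ar_loss

total = multitask_loss(asr_loss=2.0, ar_loss=1.0)
print(total)  # 2.3
```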

Automatic Speech Recognition (ASR) +4

BA-SOT: Boundary-Aware Serialized Output Training for Multi-Talker ASR

no code implementations • 23 May 2023 • Yuhao Liang, Fan Yu, Yangze Li, Pengcheng Guo, Shiliang Zhang, Qian Chen, Lei Xie

The recently proposed serialized output training (SOT) simplifies multi-talker automatic speech recognition (ASR) by generating speaker transcriptions separated by a special token.
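The serialization step described above can be illustrated with a small helper: overlapping speakers' transcriptions are concatenated (here, in order of start time) and joined by a special speaker-change token. The `<sc>` token name follows the SOT convention; the helper itself is illustrative, not the paper's code.

```python
# Sketch of building a serialized output training (SOT) target:
# transcriptions are ordered by start time and separated by the
# speaker-change token "<sc>".

def build_sot_target(utterances):
    """utterances: iterable of (start_time, transcription) pairs."""
    ordered = sorted(utterances, key=lambda u: u[0])
    return " <sc> ".join(text for _, text in ordered)

target = build_sot_target([(1.2, "how are you"), (0.4, "good morning")])
print(target)  # good morning <sc> how are you
```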

Automatic Speech Recognition (ASR) +3

CASA-ASR: Context-Aware Speaker-Attributed ASR

no code implementations • 21 May 2023 • Mohan Shi, Zhihao Du, Qian Chen, Fan Yu, Yangze Li, Shiliang Zhang, Jie Zhang, Li-Rong Dai

In addition, a two-pass decoding strategy is proposed to fully leverage the contextual modeling ability, resulting in better recognition performance.

Automatic Speech Recognition, speech-recognition +1
