Search Results for author: Fuping Pan

Found 7 papers, 3 papers with code

The GUA-Speech System Description for CNVSRC Challenge 2023

no code implementations • 12 Dec 2023 • Shengqiang Li, Chao Lei, Baozhong Ma, BinBin Zhang, Fuping Pan

This study describes our system for Task 1 Single-speaker Visual Speech Recognition (VSR) fixed track in the Chinese Continuous Visual Speech Recognition Challenge (CNVSRC) 2023.

Language Modelling, Speech Recognition +1

ZeroPrompt: Streaming Acoustic Encoders are Zero-Shot Masked LMs

no code implementations • 18 May 2023 • Xingchen Song, Di Wu, BinBin Zhang, Zhendong Peng, Bo Dang, Fuping Pan, Zhiyong Wu

In this paper, we present ZeroPrompt (Figure 1-(a)) and the corresponding Prompt-and-Refine strategy (Figure 3), two simple but effective training-free methods to decrease the Token Display Time (TDT) of streaming ASR models without any accuracy loss.

TrimTail: Low-Latency Streaming ASR with Simple but Effective Spectrogram-Level Length Penalty

1 code implementation • 1 Nov 2022 • Xingchen Song, Di Wu, Zhiyong Wu, BinBin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu

In this paper, we present TrimTail, a simple but effective emission regularization method to improve the latency of streaming ASR models.

WeKws: A production first small-footprint end-to-end Keyword Spotting Toolkit

1 code implementation • 30 Oct 2022 • Jie Wang, Menglong Xu, Jingyong Hou, BinBin Zhang, Xiao-Lei Zhang, Lei Xie, Fuping Pan

Keyword spotting (KWS) enables speech-based user interaction and gradually becomes an indispensable component of smart devices.

Keyword Spotting

WeNet 2.0: More Productive End-to-End Speech Recognition Toolkit

3 code implementations • 29 Mar 2022 • BinBin Zhang, Di Wu, Zhendong Peng, Xingchen Song, Zhuoyuan Yao, Hang Lv, Lei Xie, Chao Yang, Fuping Pan, Jianwei Niu

Recently, we made available WeNet, a production-oriented end-to-end speech recognition toolkit, which introduces a unified two-pass (U2) framework and a built-in runtime to address the streaming and non-streaming decoding modes in a single model.

Language Modelling, Speech Recognition +1
