Search Results for author: Cal Peyser

Found 8 papers, 0 papers with code

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

no code implementations • 22 Apr 2022 • W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Rohit Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Cal Peyser, Zhiyun Lu

Improving the performance of end-to-end ASR models on long utterances ranging from minutes to hours in length is an ongoing challenge in speech recognition.

Speech Recognition

Improving Rare Word Recognition with LM-aware MWER Training

no code implementations • 15 Apr 2022 • Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Cal Peyser, Trevor Strohman, Yanzhang He, David Rybach

Language models (LMs) significantly improve the recognition accuracy of end-to-end (E2E) models on words rarely seen during training, when used in either the shallow fusion or the rescoring setups.

Lookup-Table Recurrent Language Models for Long Tail Speech Recognition

no code implementations • 9 Apr 2021 • W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman

We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table.

Speech Recognition

Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus

no code implementations • 24 Aug 2020 • Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar

End-to-end (E2E) automatic speech recognition (ASR) systems lack the distinct language model (LM) component that characterizes traditional speech systems.

Automatic Speech Recognition, Speech Recognition

Improving Proper Noun Recognition in End-to-End ASR By Customization of the MWER Loss Criterion

no code implementations • 19 May 2020 • Cal Peyser, Tara N. Sainath, Golan Pundak

Proper nouns present a challenge for end-to-end (E2E) automatic speech recognition (ASR) systems in that a particular name may appear only rarely during training, and may have a pronunciation similar to that of a more common word.

Automatic Speech Recognition, Speech Recognition

A Streaming On-Device End-to-End Model Surpassing Server-Side Conventional Model Quality and Latency

no code implementations • 28 Mar 2020 • Tara N. Sainath, Yanzhang He, Bo Li, Arun Narayanan, Ruoming Pang, Antoine Bruguier, Shuo-Yiin Chang, Wei Li, Raziel Alvarez, Zhifeng Chen, Chung-Cheng Chiu, David Garcia, Alex Gruenstein, Ke Hu, Minho Jin, Anjuli Kannan, Qiao Liang, Ian McGraw, Cal Peyser, Rohit Prabhavalkar, Golan Pundak, David Rybach, Yuan Shangguan, Yash Sheth, Trevor Strohman, Mirko Visontai, Yonghui Wu, Yu Zhang, Ding Zhao

Thus far, end-to-end (E2E) models have not been shown to outperform state-of-the-art conventional models with respect to both quality, i.e., word error rate (WER), and latency, i.e., the time the hypothesis is finalized after the user stops speaking.

Improving Performance of End-to-End ASR on Numeric Sequences

no code implementations • 1 Jul 2019 • Cal Peyser, Hao Zhang, Tara N. Sainath, Zelin Wu

This out-of-vocabulary (OOV) issue is addressed in conventional ASR systems by training part of the model on spoken domain utterances (e.g.

Speech Recognition
