Search Results for author: Anshuman Tripathi

Found 7 papers, 2 papers with code

Contrastive Siamese Network for Semi-supervised Speech Recognition

no code implementations • 27 May 2022 • Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak

This paper introduces the contrastive siamese (c-siam) network, an architecture for leveraging unlabeled acoustic data in speech recognition.

Speech Recognition

Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection

1 code implementation • 23 Sep 2021 • Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio Lopez Moreno, Hasim Sak

In this paper, we present a novel speaker diarization system for streaming on-device applications.

Clustering speaker-diarization +1

Reducing Streaming ASR Model Delay with Self Alignment

no code implementations • 6 May 2021 • Jaeyoung Kim, Han Lu, Anshuman Tripathi, Qian Zhang, Hasim Sak

In LibriSpeech evaluation, self alignment outperformed existing schemes, with 25% and 56% less delay than FastEmit and constrained alignment, respectively, at a similar word error rate.

Transformer Transducer: One Model Unifying Streaming and Non-streaming Speech Recognition

no code implementations • 7 Oct 2020 • Anshuman Tripathi, Jaeyoung Kim, Qian Zhang, Han Lu, Hasim Sak

In this paper we present a Transformer-Transducer model architecture and a training technique to unify streaming and non-streaming speech recognition models into one model.

Speech Recognition

Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

5 code implementations • 7 Feb 2020 • Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar

We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy.

Speech Recognition

Toward domain-invariant speech recognition via large scale training

no code implementations • 16 Aug 2018 • Arun Narayanan, Ananya Misra, Khe Chai Sim, Golan Pundak, Anshuman Tripathi, Mohamed Elfeky, Parisa Haghani, Trevor Strohman, Michiel Bacchiani

More importantly, such models generalize better to unseen conditions and allow for rapid adaptation -- we show that by using as little as 10 hours of data from a new domain, an adapted domain-invariant model can match the performance of a domain-specific model trained from scratch using 70 times as much data.

Automatic Speech Recognition (ASR) +1

Speech recognition for medical conversations

no code implementations • 20 Nov 2017 • Chung-Cheng Chiu, Anshuman Tripathi, Katherine Chou, Chris Co, Navdeep Jaitly, Diana Jaunzeikare, Anjuli Kannan, Patrick Nguyen, Hasim Sak, Ananth Sankar, Justin Tansuwan, Nathan Wan, Yonghui Wu, Xuedong Zhang

We explored both CTC and LAS systems for building speech recognition models.

Automatic Speech Recognition (ASR) +1
