Search Results for author: Shankar Kumar

Found 22 papers, 4 papers with code

Data Strategies for Low-Resource Grammatical Error Correction

no code implementations EACL (BEA) 2021 Simon Flachs, Felix Stahlberg, Shankar Kumar

We investigate how best to take advantage of existing data sources for improving GEC systems for languages with limited quantities of high quality training data.

Grammatical Error Correction

Jam or Cream First? Modeling Ambiguity in Neural Machine Translation with SCONES

no code implementations 2 May 2022 Felix Stahlberg, Shankar Kumar

The softmax layer in neural machine translation is designed to model the distribution over mutually exclusive tokens.

Machine Translation Multi-Label Classification +1
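The contrast the abstract draws can be made concrete with a small numerical sketch. The logit values below are hypothetical, and the per-token sigmoid is only an illustration of the multi-label idea behind SCONES, not the paper's exact training objective:

```python
import numpy as np

# Hypothetical logits for three candidate target tokens, two of which
# are equally valid translations (an ambiguous source).
logits = np.array([4.0, 4.0, -2.0])

# Softmax: token probabilities are mutually exclusive and must sum to 1,
# so two equally good tokens are forced to split the mass (~0.5 each).
softmax = np.exp(logits) / np.exp(logits).sum()

# Independent sigmoids (the multi-label alternative): each token gets its
# own binary "is this token correct?" probability, so both valid tokens
# can simultaneously receive scores near 1.
sigmoid = 1.0 / (1.0 + np.exp(-logits))

print(softmax)  # mass split between the two valid tokens
print(sigmoid)  # both valid tokens scored close to 1
```

This is why a softmax model cannot express "both translations are fully acceptable": its probabilities always compete for a single unit of mass.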

Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition

no code implementations 9 Mar 2022 W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor Strohman, Shankar Kumar

We down-select a large corpus of web search queries by a factor of 53x and achieve better LM perplexities than without down-selection.

Speech Recognition

Transformer-based Models of Text Normalization for Speech Applications

no code implementations 1 Feb 2022 Jae Hun Ro, Felix Stahlberg, Ke Wu, Shankar Kumar

Text normalization, or the process of transforming text into a consistent, canonical form, is crucial for speech applications such as text-to-speech synthesis (TTS).

Speech Synthesis Text-To-Speech Synthesis

Position-Invariant Truecasing with a Word-and-Character Hierarchical Recurrent Neural Network

no code implementations 26 Aug 2021 Hao Zhang, You-Chi Cheng, Shankar Kumar, Mingqing Chen, Rajiv Mathews

Truecasing is the task of restoring the correct case (uppercase or lowercase) of noisy text generated either by an automatic system (speech recognition or machine translation) or by humans.

Machine Translation Named Entity Recognition +2

Synthetic Data Generation for Grammatical Error Correction with Tagged Corruption Models

1 code implementation EACL (BEA) 2021 Felix Stahlberg, Shankar Kumar

Synthetic data generation is widely known to boost the accuracy of neural grammatical error correction (GEC) systems, but existing methods often lack diversity or are too simplistic to generate the broad range of grammatical errors made by human writers.

Grammatical Error Correction +2

Lookup-Table Recurrent Language Models for Long Tail Speech Recognition

no code implementations 9 Apr 2021 W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman

We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table.

Speech Recognition
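The abstract's key claim is that table size can grow without increasing per-step floating point operations. A minimal sketch of that idea, assuming (hypothetically) that the extra embedding is keyed by a hash of the preceding bigram; the names and sizes below are illustrative, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab_size, dim = 1000, 64
num_buckets = 2**20  # the table can grow without adding FLOPs per step

token_emb = rng.normal(size=(vocab_size, dim))
ngram_table = rng.normal(size=(num_buckets, dim))  # large lookup table

def embed(prev_token: int, cur_token: int) -> np.ndarray:
    """Token embedding enriched by a hashed-bigram table lookup.

    Scaling up `num_buckets` adds capacity, yet each step still costs
    one hash, two row lookups, and one vector addition -- a constant
    number of floating point operations.
    """
    bucket = hash((prev_token, cur_token)) % num_buckets
    return token_emb[cur_token] + ngram_table[bucket]

x = embed(17, 42)
print(x.shape)  # (64,)
```

The design point is that a lookup is O(1) regardless of table size, unlike widening the recurrent layers, which grows the matrix multiplications.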

Seq2Edits: Sequence Transduction Using Span-level Edit Operations

1 code implementation EMNLP 2020 Felix Stahlberg, Shankar Kumar

For text normalization, sentence fusion, and grammatical error correction, our approach improves explainability by associating each edit operation with a human-readable tag.

Grammatical Error Correction Sentence Fusion +2
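The explainability claim rests on each edit carrying a human-readable tag. A toy sketch of applying such tagged span-level edits; the tuple layout and the `VERB:TENSE` tag are hypothetical illustrations, not the paper's exact operation format:

```python
# Each edit: (source_span_start, source_span_end, tag, replacement_tokens)
def apply_edits(tokens, edits):
    """Apply span-level edits left to right, copying untouched spans."""
    out, cursor = [], 0
    for start, end, tag, replacement in edits:
        out.extend(tokens[cursor:start])  # copy the unchanged prefix
        out.extend(replacement)           # apply the tagged edit
        cursor = end
    out.extend(tokens[cursor:])           # copy the unchanged suffix
    return out

src = "He go to school yesterday".split()
# The tag records *why* the span changed, which a plain
# sequence-to-sequence rewrite cannot expose.
edits = [(1, 2, "VERB:TENSE", ["went"])]
print(" ".join(apply_edits(src, edits)))  # He went to school yesterday
```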

Improving Tail Performance of a Deliberation E2E ASR Model Using a Large Text Corpus

no code implementations 24 Aug 2020 Cal Peyser, Sepand Mavandadi, Tara N. Sainath, James Apfel, Ruoming Pang, Shankar Kumar

End-to-end (E2E) automatic speech recognition (ASR) systems lack the distinct language model (LM) component that characterizes traditional speech systems.

Automatic Speech Recognition

Data Weighted Training Strategies for Grammatical Error Correction

no code implementations 7 Aug 2020 Jared Lichtarge, Chris Alberti, Shankar Kumar

Recent progress in the task of Grammatical Error Correction (GEC) has been driven by addressing data sparsity, both through new methods for generating large and noisy pretraining data and through the publication of small and higher-quality finetuning data in the BEA-2019 shared task.

Grammatical Error Correction Machine Translation +1

Transformer Transducer: A Streamable Speech Recognition Model with Transformer Encoders and RNN-T Loss

3 code implementations 7 Feb 2020 Qian Zhang, Han Lu, Hasim Sak, Anshuman Tripathi, Erik McDermott, Stephen Koo, Shankar Kumar

We present results on the LibriSpeech dataset showing that limiting the left context for self-attention in the Transformer layers makes decoding computationally tractable for streaming, with only a slight degradation in accuracy.

Frame Speech Recognition
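The streaming argument hinges on bounding the left context each frame may attend to. A small sketch of such a self-attention mask, under the simplifying assumption of a single layer and no right context; parameter names are illustrative:

```python
import numpy as np

def limited_context_mask(seq_len: int, left: int) -> np.ndarray:
    """Boolean self-attention mask: frame i may attend only to frames
    in [i - left, i]. Bounding `left` keeps the per-frame attention
    cost constant instead of growing with utterance length, which is
    what makes streaming decoding tractable."""
    i = np.arange(seq_len)[:, None]
    j = np.arange(seq_len)[None, :]
    return (j <= i) & (j >= i - left)

mask = limited_context_mask(6, left=2)
print(mask[4])  # frame 4 attends to frames 2, 3, 4 only
```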

Neural Language Modeling with Visual Features

no code implementations 7 Mar 2019 Antonios Anastasopoulos, Shankar Kumar, Hank Liao

We report analysis that provides insights into why our multimodal language model improves upon a standard RNN language model.

Language Modelling

Lingvo: a Modular and Scalable Framework for Sequence-to-Sequence Modeling

3 code implementations 21 Feb 2019 Jonathan Shen, Patrick Nguyen, Yonghui Wu, Zhifeng Chen, Mia X. Chen, Ye Jia, Anjuli Kannan, Tara Sainath, Yuan Cao, Chung-Cheng Chiu, Yanzhang He, Jan Chorowski, Smit Hinsu, Stella Laurenzo, James Qin, Orhan Firat, Wolfgang Macherey, Suyog Gupta, Ankur Bapna, Shuyuan Zhang, Ruoming Pang, Ron J. Weiss, Rohit Prabhavalkar, Qiao Liang, Benoit Jacob, Bowen Liang, HyoukJoong Lee, Ciprian Chelba, Sébastien Jean, Bo Li, Melvin Johnson, Rohan Anil, Rajat Tibrewal, Xiaobing Liu, Akiko Eriguchi, Navdeep Jaitly, Naveen Ari, Colin Cherry, Parisa Haghani, Otavio Good, Youlong Cheng, Raziel Alvarez, Isaac Caswell, Wei-Ning Hsu, Zongheng Yang, Kuan-Chieh Wang, Ekaterina Gonina, Katrin Tomanek, Ben Vanik, Zelin Wu, Llion Jones, Mike Schuster, Yanping Huang, Dehao Chen, Kazuki Irie, George Foster, John Richardson, Klaus Macherey, Antoine Bruguier, Heiga Zen, Colin Raffel, Shankar Kumar, Kanishka Rao, David Rybach, Matthew Murray, Vijayaditya Peddinti, Maxim Krikun, Michiel A. U. Bacchiani, Thomas B. Jablin, Rob Suderman, Ian Williams, Benjamin Lee, Deepti Bhatia, Justin Carlson, Semih Yavuz, Yu Zhang, Ian McGraw, Max Galkin, Qi Ge, Golan Pundak, Chad Whipkey, Todd Wang, Uri Alon, Dmitry Lepikhin, Ye Tian, Sara Sabour, William Chan, Shubham Toshniwal, Baohua Liao, Michael Nirschl, Pat Rondon

Lingvo is a TensorFlow framework offering a complete solution for collaborative deep learning research, with a particular focus on sequence-to-sequence models.

Sequence-To-Sequence Speech Recognition

Weakly Supervised Grammatical Error Correction using Iterative Decoding

no code implementations 31 Oct 2018 Jared Lichtarge, Christopher Alberti, Shankar Kumar, Noam Shazeer, Niki Parmar

We describe an approach to Grammatical Error Correction (GEC) that is effective at making use of models trained on large amounts of weakly supervised bitext.

Grammatical Error Correction

No Need for a Lexicon? Evaluating the Value of the Pronunciation Lexica in End-to-End Models

no code implementations 5 Dec 2017 Tara N. Sainath, Rohit Prabhavalkar, Shankar Kumar, Seungji Lee, Anjuli Kannan, David Rybach, Vlad Schogol, Patrick Nguyen, Bo Li, Yonghui Wu, Zhifeng Chen, Chung-Cheng Chiu

However, there has been little previous work comparing phoneme-based versus grapheme-based sub-word units in the end-to-end modeling framework, to determine whether the gains from such approaches are primarily due to the new probabilistic model, or from the joint learning of the various components with grapheme-based units.

Language Modelling

Lattice Rescoring Strategies for Long Short Term Memory Language Models in Speech Recognition

no code implementations 15 Nov 2017 Shankar Kumar, Michael Nirschl, Daniel Holtmann-Rice, Hank Liao, Ananda Theertha Suresh, Felix Yu

Recurrent neural network (RNN) language models (LMs) and Long Short Term Memory (LSTM) LMs, a variant of RNN LMs, have been shown to outperform traditional N-gram LMs on speech recognition tasks.

Speech Recognition

NN-grams: Unifying neural network and n-gram language models for Speech Recognition

no code implementations 23 Jun 2016 Babak Damavandi, Shankar Kumar, Noam Shazeer, Antoine Bruguier

The model is trained using noise contrastive estimation (NCE), an approach that transforms the estimation problem of neural networks into one of binary classification between data samples and noise samples.

Speech Recognition
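The NCE reduction the abstract describes can be sketched in a few lines. This is the standard NCE posterior for a sample under a model score and a noise distribution with k noise samples per data sample; the specific numbers are illustrative, not from the paper:

```python
import numpy as np

def nce_loss(model_logprob, noise_logprob, k, is_data):
    """Binary-classification loss at the heart of NCE.

    P(sample is data | sample) = sigmoid(s(w) - log(k * q(w))), where
    s(w) is the (unnormalized) model log-score and q(w) the noise
    distribution -- density estimation becomes telling data from noise.
    """
    logit = model_logprob - (np.log(k) + noise_logprob)
    p_data = 1.0 / (1.0 + np.exp(-logit))
    # Binary cross-entropy: label 1 for data samples, 0 for noise samples.
    return -np.log(p_data) if is_data else -np.log(1.0 - p_data)

# A data sample the model scores far above the noise distribution is
# confidently classified as data, so its loss is near zero.
loss = nce_loss(model_logprob=-2.0, noise_logprob=-10.0, k=5, is_data=True)
print(loss)
```

A practical appeal of this objective is that the model's scores never need to be normalized over the vocabulary, which avoids the expensive softmax sum during training.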

Multilingual Open Relation Extraction Using Cross-lingual Projection

no code implementations HLT 2015 Manaal Faruqui, Shankar Kumar

Open domain relation extraction systems identify relation and argument phrases in a sentence without relying on any underlying schema.

Relation Extraction
