no code implementations • ICML 2020 • Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein
Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.
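Gradient confusion is high when the gradients of different training examples point in conflicting directions (strongly negative inner products), which slows SGD. A minimal sketch of how one might measure it empirically, using a toy PyTorch model and random data that are purely illustrative:

```python
import torch
import torch.nn as nn

# Hypothetical toy data and model, just to illustrate the measurement.
torch.manual_seed(0)
xs, ys = torch.randn(8, 16), torch.randn(8, 1)
model = nn.Sequential(nn.Linear(16, 32), nn.Tanh(), nn.Linear(32, 1))
loss_fn = nn.MSELoss()

def flat_grad(x, y):
    """Gradient of the loss on a single example, flattened into one vector."""
    model.zero_grad()
    loss_fn(model(x.unsqueeze(0)), y.unsqueeze(0)).backward()
    return torch.cat([p.grad.reshape(-1) for p in model.parameters()])

grads = [flat_grad(x, y) for x, y in zip(xs, ys)]

# Gradient confusion is large when per-example gradients have strongly negative
# inner products, i.e. when examples "disagree" about the update direction.
worst = min((grads[i] @ grads[j]).item()
            for i in range(len(grads)) for j in range(i + 1, len(grads)))
print(f"most negative pairwise gradient inner product: {worst:.4f}")
```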
no code implementations • 23 Jan 2024 • W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath
In the era of large models, the autoregressive nature of decoding often makes latency a significant bottleneck.
no code implementations • 13 Jun 2023 • Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley
In this work, we study the impact of Large-scale Language Models (LLMs) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR.
no code implementations • 28 May 2023 • W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath
We address this limitation by distilling punctuation knowledge from a bidirectional teacher language model (LM) trained on written, punctuated text.
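A standard way to distill such knowledge is to train the student to match the teacher's soft punctuation posteriors with a KL term; the sketch below illustrates that loss under assumed tensor shapes and a hypothetical four-tag punctuation set, and is not necessarily the paper's exact setup:

```python
import torch
import torch.nn.functional as F

# Hypothetical per-token punctuation posteriors over a small tag set
# (e.g. NONE, COMMA, PERIOD, QUESTION) for a batch of unpunctuated text.
teacher_logits = torch.randn(4, 20, 4)   # from a bidirectional written-text LM
student_logits = torch.randn(4, 20, 4)   # from the streaming model being trained

# Distillation loss: KL(teacher || student) on the punctuation distributions,
# so the student mimics the teacher's soft punctuation decisions.
T = 2.0  # softening temperature (assumed hyperparameter)
kd_loss = F.kl_div(
    F.log_softmax(student_logits / T, dim=-1),
    F.softmax(teacher_logits / T, dim=-1),
    reduction="batchmean",
) * T * T
```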
no code implementations • 28 Nov 2022 • W. Ronny Huang, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Cal Peyser, Trevor D. Strohman
We explore unifying a neural segmenter with two-pass cascaded encoder ASR into a single model.
no code implementations • 31 Oct 2022 • Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno
In this work, we propose a modular hybrid autoregressive transducer (MHAT) that has structurally separated label and blank decoders to predict label and blank distributions, respectively, along with a shared acoustic encoder.
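A heavily simplified sketch of the separation idea, omitting the prediction network and the transducer joint; the module shapes here are assumptions, not the paper's:

```python
import torch
import torch.nn as nn

class ModularHATSketch(nn.Module):
    """Shared acoustic encoder with separate blank and label decoders (illustrative)."""
    def __init__(self, feat_dim=80, hidden=256, vocab=1000):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, batch_first=True)  # shared acoustic encoder
        self.blank_decoder = nn.Linear(hidden, 1)                   # predicts blank probability
        self.label_decoder = nn.Linear(hidden, vocab)                # predicts label distribution

    def forward(self, feats):
        enc, _ = self.encoder(feats)                                 # [B, T, hidden]
        p_blank = torch.sigmoid(self.blank_decoder(enc))             # P(blank | frame)
        p_label = torch.softmax(self.label_decoder(enc), dim=-1)     # P(label | frame, non-blank)
        # HAT-style factorization: the non-blank mass is spread over the labels.
        return p_blank, (1.0 - p_blank) * p_label
```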
no code implementations • 22 Apr 2022 • W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Rohit Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Cal Peyser, Zhiyun Lu
Improving the performance of end-to-end ASR models on long utterances ranging from minutes to hours in length is an ongoing challenge in speech recognition.
no code implementations • 20 Apr 2022 • W. Ronny Huang, Steve Chien, Om Thakkar, Rajiv Mathews
End-to-end (E2E) models are often paired with language models (LMs) via shallow fusion to boost their overall quality as well as recognition of rare words.
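In shallow fusion, the external LM is applied only at decode time by interpolating log scores; a minimal sketch of the per-step scoring rule, with illustrative tensors and an assumed interpolation weight:

```python
import torch

def shallow_fusion_step(asr_log_probs, lm_log_probs, lm_weight=0.3):
    """Combine E2E ASR and external LM scores for one decoding step.

    asr_log_probs, lm_log_probs: [vocab]-sized log-probability tensors for the
    next token given the current hypothesis (both assumed precomputed).
    lm_weight: interpolation weight, a tuned hyperparameter.
    """
    return asr_log_probs + lm_weight * lm_log_probs

# Usage in a greedy decode loop: pick the token with the best fused score.
asr_lp = torch.log_softmax(torch.randn(100), dim=-1)
lm_lp = torch.log_softmax(torch.randn(100), dim=-1)
next_token = shallow_fusion_step(asr_lp, lm_lp).argmax().item()
```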
no code implementations • 9 Mar 2022 • W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor Strohman, Shankar Kumar
We down-select a large corpus of web search queries by 53x and achieve better LM perplexities than without down-selection.
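The snippet does not describe the selection criterion itself; purely as an illustration of score-based down-selection, the sketch below keeps the top-scoring fraction of a corpus under a hypothetical `score_fn`:

```python
def down_select(sentences, score_fn, keep_fraction=1 / 53):
    """Keep only the highest-scoring fraction of a corpus (illustrative only).

    score_fn is a hypothetical stand-in for whatever relevance or quality score
    is used (e.g. a domain-classifier probability or a perplexity difference).
    """
    ranked = sorted(sentences, key=score_fn, reverse=True)
    return ranked[: max(1, int(len(ranked) * keep_fraction))]

# Toy usage: prefer shorter queries (a made-up score, purely for illustration).
corpus = ["weather today", "how to fix a flat bicycle tire", "news"]
selected = down_select(corpus, score_fn=lambda s: -len(s), keep_fraction=0.34)
```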
no code implementations • 16 Feb 2022 • Hao Zhang, You-Chi Cheng, Shankar Kumar, W. Ronny Huang, Mingqing Chen, Rajiv Mathews
Capitalization normalization (truecasing) is the task of restoring the correct case (uppercase or lowercase) of noisy text.
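As a minimal illustration of the task itself (not the paper's model), a frequency-based truecaser can restore each word's most common observed casing:

```python
from collections import Counter, defaultdict

def build_truecaser(cased_corpus):
    """Learn each word's most frequent surface form from well-cased text."""
    counts = defaultdict(Counter)
    for sentence in cased_corpus:
        for word in sentence.split():
            counts[word.lower()][word] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def truecase(noisy_text, table):
    """Restore casing word by word; unseen words are left unchanged."""
    return " ".join(table.get(w.lower(), w) for w in noisy_text.split())

table = build_truecaser(["I met Alice in Paris", "Alice likes Paris"])
print(truecase("i met alice in paris", table))  # -> "I met Alice in Paris"
```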
no code implementations • 30 Apr 2021 • Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai
Building ASR models across many languages is a challenging multi-task learning problem due to large variations and heavily unbalanced data.
no code implementations • 9 Apr 2021 • W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman
We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table.
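The point is that extra capacity comes from table lookups rather than larger matrix multiplies, so per-token FLOPs stay roughly constant as the table grows. A sketch of an n-gram lookup embedding follows; the hashing scheme and sizes are assumptions, not the paper's exact design:

```python
import torch
import torch.nn as nn

class NgramLookupEmbedding(nn.Module):
    """Augment token embeddings with a hashed bigram lookup (illustrative sketch)."""
    def __init__(self, vocab=10000, dim=64, table_rows=100_003):
        super().__init__()
        self.tok = nn.Embedding(vocab, dim)
        self.ngram = nn.Embedding(table_rows, dim)  # large table, but each step is one lookup
        self.rows = table_rows

    def forward(self, token_ids):
        # token_ids: [batch, time] integer tensor
        prev = torch.roll(token_ids, shifts=1, dims=1)
        prev[:, 0] = 0  # pretend id 0 marks "no previous token" (assumption)
        bigram_id = (prev * 9973 + token_ids) % self.rows  # simple hash (assumption)
        return self.tok(token_ids) + self.ngram(bigram_id)

emb = NgramLookupEmbedding()
out = emb(torch.randint(0, 10000, (2, 5)))  # -> [2, 5, 64]
```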
2 code implementations • NeurIPS 2021 • Chen Zhu, Renkun Ni, Zheng Xu, Kezhi Kong, W. Ronny Huang, Tom Goldstein
Innovations in neural architectures have fostered significant breakthroughs in language modeling and computer vision.
Ranked #145 on Image Classification on CIFAR-10
2 code implementations • ICLR 2021 • Jonas Geiping, Liam Fowl, W. Ronny Huang, Wojciech Czaja, Gavin Taylor, Michael Moeller, Tom Goldstein
We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data.
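One plausible way to realize such a from-scratch, clean-label attack is to craft poisons whose training gradient aligns with the gradient that would misclassify the target; the sketch below shows a cosine-alignment objective of that kind and may differ from the paper's exact formulation:

```python
import torch
import torch.nn.functional as F

def gradient_alignment_loss(model, loss_fn, poisons, poison_labels, target, adv_label):
    """Alignment between the target's adversarial gradient and the poisons'
    training gradient (a sketch; inputs are assumed to be batched tensors,
    and the poisons' perturbations are assumed to require gradients).
    """
    params = [p for p in model.parameters() if p.requires_grad]
    adv_grad = torch.autograd.grad(loss_fn(model(target), adv_label), params)
    poison_grad = torch.autograd.grad(
        loss_fn(model(poisons), poison_labels), params, create_graph=True
    )
    cos = sum(F.cosine_similarity(a.flatten(), p.flatten(), dim=0)
              for a, p in zip(adv_grad, poison_grad))
    # Minimizing this w.r.t. the poison perturbations pushes the two gradients
    # to point in the same direction.
    return 1.0 - cos / len(params)
```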
2 code implementations • NeurIPS 2020 • W. Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, Tom Goldstein
Existing attacks for data poisoning neural networks have relied on hand-crafted heuristics, because solving the poisoning problem directly via bilevel optimization is generally thought of as intractable for deep models.
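For reference, the bilevel problem alluded to here has the general form below, written in my own notation: the outer objective asks the model trained on the poisoned data to misclassify a chosen target, while the inner problem is ordinary training on the perturbed training set.

```latex
% General bilevel form of the poisoning problem (notation is mine, not the paper's).
\[
\begin{aligned}
\min_{\|\delta\|_\infty \le \epsilon} \;
  & \mathcal{L}\bigl(f_{\theta^{*}(\delta)}(x_{\mathrm{target}}),\, y_{\mathrm{adv}}\bigr) \\
\text{s.t.} \quad
  & \theta^{*}(\delta) \in \arg\min_{\theta} \sum_{i=1}^{N}
      \mathcal{L}\bigl(f_{\theta}(x_i + \delta_i),\, y_i\bigr)
\end{aligned}
\]
```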
2 code implementations • NeurIPS Workshop on Document Intelligence 2019 • W. Ronny Huang, Yike Qi, Qianqian Li, Jonathan Degange
In addition to high segmentation accuracy, we show that our cleansed images achieve a significant boost in recognition accuracy by popular OCR software such as Tesseract 4.0.
Optical Character Recognition (OCR)
1 code implementation • 29 Sep 2019 • Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson
Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.
no code implementations • 25 Sep 2019 • Karthik A. Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein
Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.
2 code implementations • NeurIPS Workshop ICBINB 2020 • W. Ronny Huang, Zeyad Emam, Micah Goldblum, Liam Fowl, Justin K. Terry, Furong Huang, Tom Goldstein
The power of neural networks lies in their ability to generalize to unseen data, yet the underlying reasons for this phenomenon remain elusive.
1 code implementation • 15 May 2019 • Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein
Clean-label poisoning attacks inject innocuous-looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.
no code implementations • 15 Apr 2019 • Karthik A. Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein
Our results show that, for popular initialization techniques, increasing the width of neural networks leads to lower gradient confusion, and thus faster model training.
no code implementations • 27 Nov 2018 • W. Ronny Huang, Miguel A. Perez
The data collection is synthetic; examples are crafted using optimal experimental design methods, obviating the need for real data, which is often difficult to obtain due to regulatory concerns.
no code implementations • ICLR 2019 • Ali Shafahi, W. Ronny Huang, Christoph Studer, Soheil Feizi, Tom Goldstein
Using experiments, we explore the implications of theoretical guarantees for real-world problems and discuss how factors such as dimensionality and image complexity limit a classifier's robustness against adversarial examples.
5 code implementations • NeurIPS 2018 • Ali Shafahi, W. Ronny Huang, Mahyar Najibi, Octavian Suciu, Christoph Studer, Tudor Dumitras, Tom Goldstein
The proposed attacks use "clean labels"; they don't require the attacker to have any control over the labeling of training data.
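The classic recipe in this line of work optimizes each poison to collide with the target in feature space while staying visually close to a benign base image; a minimal single-step sketch, with function and variable names of my own choosing:

```python
import torch

def feature_collision_step(poison, base, target_feat, feat_fn, beta=0.1, lr=0.01):
    """One gradient step of a feature-collision style poison (sketch only).

    poison:      current poison image, a leaf tensor with requires_grad=True
    base:        the clean base image the poison must stay visually close to
    target_feat: feature vector of the target test image (precomputed)
    feat_fn:     penultimate-layer feature extractor of the victim network (assumed)
    """
    # Pull the poison toward the target in feature space, while a pixel-space
    # penalty keeps it looking like the benign base image.
    loss = (feat_fn(poison) - target_feat).pow(2).sum() + beta * (poison - base).pow(2).sum()
    grad, = torch.autograd.grad(loss, poison)
    # A projection onto valid pixel values / a perturbation budget would go here.
    return (poison - lr * grad).detach().requires_grad_(True)
```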