Search Results for author: W. Ronny Huang

Found 24 papers, 8 papers with code

Analyzing the effect of neural network architecture on training performance

no code implementations · ICML 2020 · Karthik Abinav Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein

Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.
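Gradient confusion refers to negative inner products between gradients computed on different training samples, which slows SGD because updates for one sample undo progress on another. A minimal sketch of measuring it over per-example gradients (function name is illustrative, not from the paper):

```python
import numpy as np

def gradient_confusion(grads):
    """Return the most negative pairwise inner product among
    per-example gradient vectors; values below zero indicate
    confusion (gradients pulling in conflicting directions)."""
    worst = 0.0
    for i in range(len(grads)):
        for j in range(i + 1, len(grads)):
            worst = min(worst, float(np.dot(grads[i], grads[j])))
    return worst
```

Here a return value of 0.0 means no pair of gradients conflicts, while increasingly negative values indicate stronger confusion.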

Large-scale Language Model Rescoring on Long-form Data

no code implementations · 13 Jun 2023 · Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley

In this work, we study the impact of Large-scale Language Models (LLM) on Automated Speech Recognition (ASR) of YouTube videos, which we use as a source for long-form ASR.

Language Modelling speech-recognition +1

Semantic Segmentation with Bidirectional Language Models Improves Long-form ASR

no code implementations · 28 May 2023 · W. Ronny Huang, Hao Zhang, Shankar Kumar, Shuo-Yiin Chang, Tara N. Sainath

We address this limitation by distilling punctuation knowledge from a bidirectional teacher language model (LM) trained on written, punctuated text.

Language Modelling Semantic Segmentation +1

Modular Hybrid Autoregressive Transducer

no code implementations · 31 Oct 2022 · Zhong Meng, Tongzhou Chen, Rohit Prabhavalkar, Yu Zhang, Gary Wang, Kartik Audhkhasi, Jesse Emond, Trevor Strohman, Bhuvana Ramabhadran, W. Ronny Huang, Ehsan Variani, Yinghui Huang, Pedro J. Moreno

In this work, we propose a modular hybrid autoregressive transducer (MHAT) that has structurally separated label and blank decoders to predict label and blank distributions, respectively, along with a shared acoustic encoder.

Language Modelling speech-recognition +1

E2E Segmenter: Joint Segmenting and Decoding for Long-Form ASR

no code implementations · 22 Apr 2022 · W. Ronny Huang, Shuo-Yiin Chang, David Rybach, Rohit Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Cal Peyser, Zhiyun Lu

Improving the performance of end-to-end ASR models on long utterances ranging from minutes to hours in length is an ongoing challenge in speech recognition.

Sentence speech-recognition +1

Detecting Unintended Memorization in Language-Model-Fused ASR

no code implementations · 20 Apr 2022 · W. Ronny Huang, Steve Chien, Om Thakkar, Rajiv Mathews

End-to-end (E2E) models are often accompanied by language models (LMs) via shallow fusion to boost their overall quality as well as their recognition of rare words.
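Shallow fusion combines the E2E model's score with the LM's score during decoding by log-linear interpolation. A minimal sketch of the per-token scoring rule (the function name and fusion weight `lam` are illustrative, not from the paper):

```python
def shallow_fusion_score(e2e_logprobs, lm_logprobs, lam=0.3):
    """Combine per-token log-probabilities from the E2E ASR model
    with those from an external LM: score = log p_E2E + lam * log p_LM."""
    return [a + lam * b for a, b in zip(e2e_logprobs, lm_logprobs)]
```

In practice this score is applied to each hypothesis during beam search, so the LM can re-rank candidates, which is also the channel through which LM memorization can surface in ASR outputs.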

Language Modelling Memorization

Sentence-Select: Large-Scale Language Model Data Selection for Rare-Word Speech Recognition

no code implementations · 9 Mar 2022 · W. Ronny Huang, Cal Peyser, Tara N. Sainath, Ruoming Pang, Trevor Strohman, Shankar Kumar

We down-select a large corpus of web search queries by a factor of 53x while achieving better LM perplexity than without down-selection.

Language Modelling Sentence +2

Scaling End-to-End Models for Large-Scale Multilingual ASR

no code implementations · 30 Apr 2021 · Bo Li, Ruoming Pang, Tara N. Sainath, Anmol Gulati, Yu Zhang, James Qin, Parisa Haghani, W. Ronny Huang, Min Ma, Junwen Bai

Building ASR models across many languages is a challenging multi-task learning problem due to large variations and heavily unbalanced data.

Multi-Task Learning

Lookup-Table Recurrent Language Models for Long Tail Speech Recognition

no code implementations · 9 Apr 2021 · W. Ronny Huang, Tara N. Sainath, Cal Peyser, Shankar Kumar, David Rybach, Trevor Strohman

We introduce Lookup-Table Language Models (LookupLM), a method for scaling up the size of RNN language models with only a constant increase in the floating point operations, by increasing the expressivity of the embedding table.

Language Modelling Sentence +2

Witches' Brew: Industrial Scale Data Poisoning via Gradient Matching

2 code implementations · ICLR 2021 · Jonas Geiping, Liam Fowl, W. Ronny Huang, Wojciech Czaja, Gavin Taylor, Michael Moeller, Tom Goldstein

We consider a particularly malicious poisoning attack that is both "from scratch" and "clean label", meaning we analyze an attack that successfully works against new, randomly initialized models, and is nearly imperceptible to humans, all while perturbing only a small fraction of the training data.

Data Poisoning

MetaPoison: Practical General-purpose Clean-label Data Poisoning

2 code implementations · NeurIPS 2020 · W. Ronny Huang, Jonas Geiping, Liam Fowl, Gavin Taylor, Tom Goldstein

Existing attacks for data poisoning neural networks have relied on hand-crafted heuristics, because solving the poisoning problem directly via bilevel optimization is generally thought of as intractable for deep models.

AutoML Bilevel Optimization +2

DeepErase: Weakly Supervised Ink Artifact Removal in Document Text Images

2 code implementations · NeurIPS Workshop on Document Intelligence 2019 · W. Ronny Huang, Yike Qi, Qianqian Li, Jonathan Degange

In addition to high segmentation accuracy, we show that our cleansed images achieve a significant boost in recognition accuracy by popular OCR software such as Tesseract 4.0.

Optical Character Recognition (OCR)

Deep k-NN Defense against Clean-label Data Poisoning Attacks

1 code implementation · 29 Sep 2019 · Neehar Peri, Neal Gupta, W. Ronny Huang, Liam Fowl, Chen Zhu, Soheil Feizi, Tom Goldstein, John P. Dickerson

Targeted clean-label data poisoning is a type of adversarial attack on machine learning systems in which an adversary injects a few correctly-labeled, minimally-perturbed samples into the training data, causing a model to misclassify a particular test sample during inference.
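The defense filters training points whose labels disagree with those of their nearest neighbors in deep feature space, since clean-label poisons sit among samples of another class. A rough sketch of the k-NN filtering idea, operating on precomputed feature vectors (names and the majority-vote tie-breaking are illustrative):

```python
import numpy as np

def deep_knn_filter(features, labels, k=3):
    """Flag indices whose label disagrees with the majority label of
    their k nearest neighbors in (deep) feature space."""
    flagged = []
    for i in range(len(features)):
        dists = np.linalg.norm(features - features[i], axis=1)
        dists[i] = np.inf  # exclude the point itself
        neighbors = np.argsort(dists)[:k]
        votes = [labels[j] for j in neighbors]
        majority = max(set(votes), key=votes.count)
        if majority != labels[i]:
            flagged.append(i)
    return flagged
```

A point surrounded by samples of a different class is flagged as a suspected poison and can be removed before (re)training.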

Adversarial Attack Data Poisoning

The Effect of Neural Net Architecture on Gradient Confusion & Training Performance

no code implementations · 25 Sep 2019 · Karthik A. Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein

Through novel theoretical and experimental results, we show how the neural net architecture affects gradient confusion, and thus the efficiency of training.

Understanding Generalization through Visualizations

2 code implementations · NeurIPS Workshop ICBINB 2020 · W. Ronny Huang, Zeyad Emam, Micah Goldblum, Liam Fowl, Justin K. Terry, Furong Huang, Tom Goldstein

The power of neural networks lies in their ability to generalize to unseen data, yet the underlying reasons for this phenomenon remain elusive.

Transferable Clean-Label Poisoning Attacks on Deep Neural Nets

1 code implementation · 15 May 2019 · Chen Zhu, W. Ronny Huang, Ali Shafahi, Hengduo Li, Gavin Taylor, Christoph Studer, Tom Goldstein

Clean-label poisoning attacks inject innocuous looking (and "correctly" labeled) poison images into training data, causing a model to misclassify a targeted image after being trained on this data.

Transfer Learning

The Impact of Neural Network Overparameterization on Gradient Confusion and Stochastic Gradient Descent

no code implementations · 15 Apr 2019 · Karthik A. Sankararaman, Soham De, Zheng Xu, W. Ronny Huang, Tom Goldstein

Our results show that, for popular initialization techniques, increasing the width of neural networks leads to lower gradient confusion, and thus faster model training.

Accurate, Data-Efficient Learning from Noisy, Choice-Based Labels for Inherent Risk Scoring

no code implementations · 27 Nov 2018 · W. Ronny Huang, Miguel A. Perez

The data collection is synthetic; examples are crafted using optimal experimental design methods, obviating the need for real data, which is often difficult to obtain due to regulatory concerns.

Experimental Design

Are adversarial examples inevitable?

no code implementations · ICLR 2019 · Ali Shafahi, W. Ronny Huang, Christoph Studer, Soheil Feizi, Tom Goldstein

Using experiments, we explore the implications of theoretical guarantees for real-world problems and discuss how factors such as dimensionality and image complexity limit a classifier's robustness against adversarial examples.
