We study model pruning methods applied to Transformer-based neural network language models for automatic speech recognition.
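As a hypothetical illustration of one common family of pruning methods (not necessarily the specific method studied here), the sketch below applies magnitude pruning to a single weight matrix with NumPy; the sparsity level and tensor are made-up examples.

```python
import numpy as np

def magnitude_prune(weights: np.ndarray, sparsity: float) -> np.ndarray:
    """Zero out the smallest-magnitude entries so that roughly `sparsity`
    (e.g. 0.5 = 50%) of the weights are removed."""
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    # Threshold is the k-th smallest absolute value.
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    mask = np.abs(weights) > threshold
    return weights * mask

# Toy example: prune half of a random projection matrix.
rng = np.random.default_rng(0)
W = rng.normal(size=(8, 8))
W_pruned = magnitude_prune(W, sparsity=0.5)
print(f"nonzero fraction: {np.count_nonzero(W_pruned) / W.size:.2f}")
```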
On-device automatic speech recognition systems face several challenges compared to server-based systems.
Virtual assistants rely on automatic speech recognition (ASR) to help answer users' entity-centric queries.
In this work, we uncover a theoretical connection between two language model interpolation techniques, count merging and Bayesian interpolation.
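For context, both techniques can be viewed as ways of choosing the mixture weights in a linear interpolation of component language models. The hedged sketch below shows only the generic linear interpolation p(w|h) = sum_i lambda_i * p_i(w|h); the specific weight choices made by count merging and Bayesian interpolation are not reproduced here, and all names and numbers are illustrative.

```python
def interpolate(dists, weights):
    """Linearly interpolate per-word probability distributions.

    dists:   list of dicts mapping word -> p_i(word | history)
    weights: mixture weights lambda_i, assumed to sum to 1
    """
    vocab = set().union(*dists)
    return {w: sum(lam * d.get(w, 0.0) for lam, d in zip(weights, dists))
            for w in vocab}

# Illustrative two-component example with made-up probabilities.
p_general = {"play": 0.2, "call": 0.5, "music": 0.3}
p_domain  = {"play": 0.6, "call": 0.1, "music": 0.3}
mixed = interpolate([p_general, p_domain], weights=[0.7, 0.3])
print(mixed)  # e.g. p("play") = 0.7*0.2 + 0.3*0.6 = 0.32
```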
The performance of Neural Network (NN)-based language models is steadily improving due to the emergence of new architectures, which are able to learn different natural language characteristics.
The goal of language modeling techniques is to capture the statistical and structural properties of natural languages from training corpora.
Training large-vocabulary Neural Network Language Models (NNLMs) is difficult because the output layer must be explicitly normalized, which typically requires evaluating the full softmax function over the complete vocabulary.
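To make the normalization cost concrete, the sketch below (illustrative only, plain NumPy with made-up sizes) computes the probability of a single target word under a full softmax output layer: every one of the |V| logits must be evaluated and summed, which is what makes large-vocabulary training expensive.

```python
import numpy as np

def full_softmax_prob(hidden, W_out, b_out, target_id):
    """Probability of one target word under a full softmax output layer.

    hidden:    context vector h of size d
    W_out:     output embedding matrix of shape (|V|, d)
    b_out:     output bias of shape (|V|,)
    target_id: index of the target word
    """
    logits = W_out @ hidden + b_out        # O(|V| * d): touches every vocabulary word
    logits -= logits.max()                 # numerical stability
    exp_logits = np.exp(logits)
    return exp_logits[target_id] / exp_logits.sum()   # normalizer sums all |V| terms

# Even for a modest 100k-word vocabulary, the matrix-vector product above
# dominates the cost of scoring a single word.
rng = np.random.default_rng(0)
V, d = 100_000, 256
p = full_softmax_prob(rng.normal(size=d), rng.normal(size=(V, d)) * 0.01,
                      np.zeros(V), target_id=42)
print(p)
```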
Feedforward Neural Network (FNN)-based language models estimate the probability of the next word from the history of the last N words, whereas Recurrent Neural Network (RNN) models perform the same task using only the last word and a recurrent state that carries context through the network.
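A minimal sketch of this difference in conditioning, assuming toy dimensions and random weights (not any specific published model): the feedforward LM consumes a fixed window of the last N word embeddings, while the recurrent LM consumes one word at a time together with a hidden state that accumulates the earlier context.

```python
import numpy as np

rng = np.random.default_rng(0)
d, N, V = 32, 4, 1000                        # embedding size, FNN window, vocab (made up)
E = rng.normal(size=(V, d)) * 0.1            # word embeddings

# Feedforward LM: the context is the concatenation of the last N embeddings.
W_fnn = rng.normal(size=(d, N * d)) * 0.1
def fnn_context(last_n_word_ids):
    x = np.concatenate([E[i] for i in last_n_word_ids])   # fixed-size window
    return np.tanh(W_fnn @ x)

# Recurrent LM: the context is a hidden state updated one word at a time.
W_in  = rng.normal(size=(d, d)) * 0.1
W_rec = rng.normal(size=(d, d)) * 0.1
def rnn_context(word_ids):
    h = np.zeros(d)
    for i in word_ids:                       # each step sees only the current word and h
        h = np.tanh(W_in @ E[i] + W_rec @ h)
    return h

history = [12, 7, 256, 3]
print(fnn_context(history).shape, rnn_context(history).shape)
```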
We augmented pre-trained word embeddings with these novel embeddings and evaluated them on a rare word similarity task, obtaining up to a threefold improvement in correlation over the original embeddings.
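As a hedged illustration of this kind of evaluation (the augmentation method itself is not shown, and all vectors and word pairs below are made up): word-pair similarities computed from the embeddings are ranked against human ratings using Spearman correlation, as is standard for word similarity benchmarks such as the Rare Word dataset.

```python
import numpy as np
from scipy.stats import spearmanr

def cosine(u, v):
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

def word_similarity_correlation(embeddings, pairs, human_scores):
    """Spearman correlation between embedding similarities and human ratings.

    embeddings:   dict word -> vector
    pairs:        list of (word1, word2)
    human_scores: gold similarity rating per pair
    """
    model_scores = [cosine(embeddings[a], embeddings[b]) for a, b in pairs]
    rho, _ = spearmanr(model_scores, human_scores)
    return rho

# Toy example with made-up 3-d vectors and ratings.
emb = {"cat": np.array([1.0, 0.2, 0.0]),
       "feline": np.array([0.9, 0.3, 0.1]),
       "car": np.array([0.0, 1.0, 0.5])}
pairs = [("cat", "feline"), ("cat", "car")]
print(word_similarity_correlation(emb, pairs, human_scores=[9.0, 1.5]))
```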