NeMo: a toolkit for building AI applications using Neural Modules

NeMo (Neural Modules) is a framework-agnostic Python toolkit for creating AI applications through re-usability, abstraction, and composition. NeMo is built around neural modules: conceptual blocks of neural networks that take typed inputs and produce typed outputs. Such modules typically represent data layers, encoders, decoders, language models, loss functions, or methods of combining activations. NeMo makes it easy to combine and re-use these building blocks, and its neural type system provides a level of semantic correctness checking. The toolkit comes with extensible collections of pre-built modules for automatic speech recognition and natural language processing. NeMo also provides built-in support for distributed training and mixed precision on the latest NVIDIA GPUs. NeMo is open source: https://github.com/NVIDIA/NeMo
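The idea of typed inputs and outputs with semantic checking can be illustrated with a small sketch. This is not NeMo's API: the names `NeuralType`, `Module`, `connect`, and `TypeMismatch` are hypothetical and chosen only to show how a type check at connection time could reject an incompatible pairing of modules before any training runs.

```python
from dataclasses import dataclass
from typing import Dict, Tuple


@dataclass(frozen=True)
class NeuralType:
    """Hypothetical type tag: what a tensor represents and how its axes are laid out."""
    element: str            # e.g. "audio_signal", "encoded", "log_probs"
    axes: Tuple[str, ...]   # e.g. ("batch", "time")


class TypeMismatch(TypeError):
    """Raised when an output port is wired to an input port of a different type."""


@dataclass
class Module:
    """Hypothetical module description: a name plus typed input and output ports."""
    name: str
    inputs: Dict[str, NeuralType]
    outputs: Dict[str, NeuralType]


def connect(producer: Module, out_port: str, consumer: Module, in_port: str):
    """Check that the two ports carry the same type, then return the edge."""
    produced = producer.outputs[out_port]
    expected = consumer.inputs[in_port]
    if produced != expected:
        raise TypeMismatch(
            f"{producer.name}.{out_port} is {produced}, "
            f"but {consumer.name}.{in_port} expects {expected}"
        )
    return (producer.name, out_port, consumer.name, in_port)


# Illustrative types and modules (assumed names, not taken from the paper).
AUDIO = NeuralType("audio_signal", ("batch", "time"))
ENCODED = NeuralType("encoded", ("batch", "dim", "time"))
LOG_PROBS = NeuralType("log_probs", ("batch", "time", "vocab"))

data_layer = Module("data_layer", inputs={}, outputs={"audio": AUDIO})
encoder = Module("encoder", inputs={"audio": AUDIO}, outputs={"encoded": ENCODED})
decoder = Module("decoder", inputs={"encoded": ENCODED}, outputs={"log_probs": LOG_PROBS})

# A compatible connection passes the check and yields an edge description.
edge = connect(data_layer, "audio", encoder, "audio")

# An incompatible connection (raw audio fed straight to the decoder) is rejected.
try:
    connect(data_layer, "audio", decoder, "encoded")
    mismatch_caught = False
except TypeMismatch:
    mismatch_caught = True
```

In this sketch the check compares types for exact equality; a real type system would likely allow richer compatibility rules, which the abstract does not detail.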


Results from the Paper


Ranked #1 on Speech Recognition on Common Voice Spanish (using extra training data)

Task                Dataset               Model                    Metric Name  Metric Value  Global Rank
Speech Recognition  Common Voice French   ConformerCTC-L (4-gram)  Test WER     9.16%         #2
Speech Recognition  Common Voice French   ConformerCTC-L (no LM)   Test WER     9.63%         #4
Speech Recognition  Common Voice German   ConformerCTC-L (4-gram)  Test WER     6.03%         #5
Speech Recognition  Common Voice German   ConformerCTC-L (no LM)   Test WER     6.68%         #9
Speech Recognition  Common Voice Spanish  ConformerCTC-L (4-gram)  Test WER     5.5%          #1
Speech Recognition  Common Voice Spanish  ConformerCTC-L (no LM)   Test WER     6.9%          #4
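The metric in these results is word error rate (WER). The page does not say how the scores were computed, so the following is only a minimal sketch of the standard definition: the word-level edit distance (substitutions, deletions, insertions) between reference and hypothesis, divided by the number of reference words. The function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER for one utterance: word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]  # distance from i reference words to an empty hypothesis
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr.append(
                min(
                    prev[j] + 1,                      # deletion of a reference word
                    curr[j - 1] + 1,                  # insertion of a hypothesis word
                    prev[j - 1] + substitution_cost,  # match or substitution
                )
            )
        prev = curr
    return prev[-1] / len(ref)


# One substitution ("b" -> "x") and one deletion ("d") over four reference words.
example_wer = word_error_rate("a b c d", "a x c")
```

Benchmark scores are usually aggregated over a whole test set by summing errors and reference lengths across utterances rather than averaging per-utterance values; this sketch covers a single utterance only.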
